GLYCOSIDASE ENZYMES
BACKGROUND OF THE INVENTION 

1. Field of the Invention

This invention relates to newly identified polynucleotides, polypeptides encoded by 
such polynucleotides, the use of such polynucleotides and polypeptides, as well as the 
production and isolation of such polynucleotides and polypeptides. More particularly, the
polynucleotides and polypeptides of the present invention have been putatively identified as
glucosidases, α-galactosidases, β-galactosidases, β-mannosidases, β-mannanases,
endoglucanases, and pullulanases.

2. Description of Related Art

The glycosidic bond of β-galactosides can be cleaved by different classes of
enzymes: (i) phospho-β-galactosidases (EC 3.2.1.85), which are specific for a phosphorylated
substrate generated via phosphoenolpyruvate phosphotransferase system (PTS)-dependent
uptake; (ii) typical β-galactosidases (EC 3.2.1.23), represented by the Escherichia coli LacZ
enzyme, which are relatively specific for β-galactosides; and (iii) β-glucosidases (EC
3.2.1.21) such as the enzymes of Agrobacterium faecalis, Clostridium thermocellum,
Pyrococcus furiosus or Sulfolobus solfataricus (Day, A.G. and Withers, S.G. (1986)
Purification and characterization of a β-glucosidase from Alcaligenes faecalis. Can. J.
Biochem. Cell. Biol. 64, 914-922; Kengen, S.W.M., et al. (1993) Eur. J. Biochem. 213,
305-312; Ait, N., Creuzet, N. and Cattaneo, J. (1982) Properties of β-glucosidase purified
from Clostridium thermocellum. J. Gen. Microbiol. 128, 569-577; Grogan, D.W. (1991)
Evidence that β-galactosidase of Sulfolobus solfataricus is only one of several activities of
a thermostable β-D-glycosidase. Appl. Environ. Microbiol. 57, 1644-1649). Members of
the latter group, although highly specific with respect to the β-anomeric configuration of
the glycosidic linkage, often display a rather relaxed substrate specificity and hydrolyze
β-glucosides as well as β-fucosides and β-galactosides.

Generally, α-galactosidases are enzymes that catalyze the hydrolysis of galactose
groups on a polysaccharide backbone or catalyze the cleavage of di- or oligosaccharides
comprising galactose.

Generally, β-mannanases are enzymes that catalyze the hydrolysis of mannose
groups internally on a polysaccharide backbone or catalyze the cleavage of di- or
oligosaccharides comprising mannose groups. β-mannosidases hydrolyze non-reducing,
terminal mannose residues on a mannose-containing polysaccharide and catalyze the
cleavage of di- or oligosaccharides comprising mannose groups.

Guar gum is a branched galactomannan polysaccharide composed of a β-1,4-linked
mannose backbone with α-1,6-linked galactose side chains. The enzymes required for the
degradation of guar are β-mannanase, β-mannosidase and α-galactosidase. β-mannanase
hydrolyzes the mannose backbone internally, β-mannosidase hydrolyzes non-reducing,
terminal mannose residues, and α-galactosidase hydrolyzes α-linked galactose groups.

Galactomannan polysaccharides and the enzymes that degrade them have a variety 
of applications. Guar is commonly used as a thickening agent in food and is utilized in 
hydraulic fracturing in oil and gas recovery. Consequently, galactomannanases are 
industrially relevant for the degradation and modification of guar. Furthermore, a need
exists for thermostable galactomannanases that are active in the extreme conditions
associated with drilling and well stimulation.

There are other applications for these enzymes in various industries, such as in the
beet sugar industry. Between 20% and 30% of the domestic U.S. sucrose consumption is
sucrose from sugar beets. Raw beet sugar can contain a small amount of raffinose when the
sugar beets are stored before processing and rotting begins to set in. Raffinose inhibits the
crystallization of sucrose and also constitutes a hidden quantity of sucrose. Thus, there is
merit to eliminating raffinose from raw beet sugar. α-Galactosidase has also been used as
a digestive aid to break down raffinose, stachyose, and verbascose in such foods as beans
and other gassy foods.

WO 98/24799 PCT/US97/22623

β-galactosidases which are active and stable at high temperatures appear to be
superior enzymes for the production of lactose-free dietary milk products (Chaplin, M.F.
and Bucke, C. (1990) In: Enzyme Technology, pp. 159-160, Cambridge University Press,
Cambridge, UK). Also, several studies have demonstrated the applicability of
β-galactosidases to the enzymatic synthesis of oligosaccharides via transglycosylation
reactions (Nilsson, K.G.I. (1988) Enzymatic synthesis of oligosaccharides. Trends
Biotechnol. 6, 256-264; Cote, G.L. and Tao, B.Y. (1990) Oligosaccharide synthesis by
enzymatic transglycosylation. Glycoconjugate J. 7, 145-162). Despite the commercial
potential, only a few β-galactosidases of thermophiles have been characterized so far. Two
reported genes encode β-galactoside-cleaving enzymes of the hyperthermophilic bacterium
Thermotoga maritima, one of the most thermophilic organotrophic eubacteria described to
date (Huber, R., Langworthy, T.A., Konig, H., Thomm, M., Woese, C.R., Sleytr, U.B. and
Stetter, K.O. (1986) T. maritima sp. nov. represents a new genus of unique extremely
thermophilic eubacteria growing up to 90°C. Arch. Microbiol. 144, 324-333). The gene
products have been identified as a β-galactosidase and a β-glucosidase.

Pullulanase is well known as a debranching enzyme of pullulan and starch. The
enzyme hydrolyzes α-1,6-glucosidic linkages on these polymers. Starch degradation for
the production of sweeteners (glucose or maltose) is a very important industrial application
of this enzyme. The degradation of starch proceeds in two stages. The first stage
involves the liquefaction of the substrate with α-amylase, and the second stage, or
saccharification stage, is performed by β-amylase with pullulanase added as a debranching
enzyme to obtain better yields.

Endoglucanases can be used in a variety of industrial applications. For instance, the 
endoglucanases of the present invention can hydrolyze the internal β-1,4-glycosidic bonds
in cellulose, which may be used for the conversion of plant biomass into fuels and 
chemicals. Endoglucanases also have applications in detergent formulations, the textile 
industry, in animal feed, in waste treatment, and in the fruit juice and brewing industry for 
the clarification and extraction of juices. 

Brief Description of the Drawings

The following drawings are illustrative of embodiments of the invention and are not 
meant to limit the scope of the invention as encompassed by the claims. 

Figures 1a-b are the full-length DNA and corresponding deduced amino acid
sequence of M11TL of the present invention. Sequencing was performed using a 378
automated DNA sequencer for all sequences of the present invention (Applied Biosystems,
Inc.).

Figure 2 is an illustration of the full-length DNA and corresponding deduced amino
acid sequence of OC1/4V-33B/G.

Figure 3 is an illustration of the full-length DNA and corresponding deduced amino
acid sequence of F1-12G.

Figures 4a-b are the full-length DNA and corresponding deduced amino acid
sequence of 9N2-31B/G.

Figures 5a-b are the full-length DNA and corresponding deduced amino acid
sequence of MSB8-6G.

Figure 6 is the full-length DNA and corresponding deduced amino acid sequence
of AEDII12RA-18B/G.

Figures 7a-b are the full-length DNA and corresponding deduced amino acid
sequence of GC74-22G.

Figures 8a-b are the full-length DNA and corresponding deduced amino acid
sequence of VC1-7G1.

Figures 9a-c are the full-length DNA and corresponding deduced amino acid
sequence of 37GP1.

Figures 10a-c are the full-length DNA and corresponding deduced amino acid
sequence of 6GC2.

Figures 11a-d are the full-length DNA and corresponding deduced amino acid
sequence of 6GP2.

Figures 12a-c are the full-length DNA and corresponding deduced amino acid
sequence of 63GB1.

Figures 13a-b are the full-length DNA and corresponding deduced amino acid
sequence of OC1/4V.

Figures 14a-e are the full-length DNA and corresponding deduced amino acid
sequence of 6GP3.

Figures 15a-d are the full-length DNA and corresponding deduced amino acid
sequence of Thermotoga maritima MSB8-6GP2.

Figures 16a-c are the full-length DNA and corresponding deduced amino acid
sequence of Thermotoga maritima MSB8-6GB4.

Figures 17a-d are the full-length DNA and corresponding deduced amino acid
sequence of Bankia gouldi 37GP4.

Figures 18a-b are the full-length DNA and corresponding deduced amino acid
sequence of Pyrococcus furiosus VC1-7EG1.

SUMMARY OF THE INVENTION 

In a preferred embodiment of the present invention, there are provided isolated
nucleic acids (polynucleotides) which encode mature enzymes having the deduced amino
acid sequences of Figures 1-18 (SEQ ID NOS: 15-28 and 61-64).

In another embodiment, the invention provides a method for producing a
polypeptide including culturing host cells containing the polynucleotide of Figures 1-18,
expressing from the host cell a polypeptide encoded by the polynucleotide, and isolating the
polypeptide.

In another embodiment, the invention provides an enzyme selected from the group
consisting of an enzyme having an amino acid sequence set forth in SEQ ID NOS: 15-28
or 61-64 and an enzyme which has at least 30 consecutive amino acid residues of an enzyme
having an amino acid sequence set forth in SEQ ID NOS: 15-28 or 61-64.

In yet another embodiment, the invention provides a method for generating glucose
from soluble cell oligosaccharides which includes contacting a sample containing
oligosaccharides with an effective amount of an enzyme selected from the group of
enzymes having the amino acid sequence set forth in SEQ ID NOS: 15-28, 61-63 and 64
such that glucose is produced.

The publications discussed herein are provided solely for their disclosure prior to
the filing date of the present application. Nothing herein is to be construed as an
admission that the invention is not entitled to antedate such disclosure by virtue of prior
invention.

Definitions

"Monosaccharide", as used herein, refers to a single polyhydroxy aldehyde or
ketone unit.

"Oligosaccharide", as used herein, refers to short chains of monosaccharide units
joined together by covalent bonds. Of these, the most abundant are the disaccharides,
which have two monosaccharide units.

"Polysaccharide", as used herein, refers to long chains having many
monosaccharide units.

The term "gene" means the segment of DNA involved in producing a polypeptide 
chain; it includes regions preceding and following the coding region (leader and trailer) as 
well as intervening sequences (introns) between individual coding segments (exons). 

A coding sequence is "operably linked to" another coding sequence when RNA 
polymerase will transcribe the two coding sequences into a single mRNA, which is then 
translated into a single polypeptide having amino acids derived from both coding 
sequences. The coding sequences need not be contiguous to one another so long as the 
expressed sequences ultimately process to produce the desired protein. 

"Recombinant" enzymes refer to enzymes produced by recombinant DNA 
techniques; i.e., produced from cells transformed by an exogenous DNA construct encoding 
the desired enzyme. "Synthetic" enzymes are those prepared by chemical synthesis. 

A DNA "coding sequence of" or a "nucleotide sequence encoding" a particular
enzyme, is a DNA sequence which is transcribed and translated into an enzyme when 
placed under the control of appropriate regulatory sequences. 

Detailed Description of the Invention

The polynucleotides and polypeptides of the present invention have been identified
as glucosidases, α-galactosidases, β-galactosidases, β-mannosidases, β-mannanases,
endoglucanases, and pullulanases as a result of their enzymatic activity.

In accordance with one aspect of the present invention, there are provided novel 
enzymes, as well as active fragments, analogs and derivatives thereof. 

In accordance with another aspect of the present invention, there are provided 
isolated nucleic acid molecules encoding the enzymes of the present invention including 
mRNAs, cDNAs, genomic DNAs as well as active analogs and fragments of such enzymes. 

In accordance with yet a further aspect of the present invention, there is provided 
a process for producing such polypeptides by recombinant techniques comprising culturing 
recombinant prokaryotic and/or eukaryotic host cells, containing a nucleic acid sequence 
of the present invention, under conditions promoting expression of said enzymes and 
subsequent recovery of said enzymes. 

In accordance with yet a further aspect of the present invention, there is provided
a process for utilizing such enzymes, or polynucleotides encoding such enzymes, for
hydrolyzing lactose to galactose and glucose for use in the food processing industry, the
pharmaceutical industry, for example, to treat intolerance to lactose, as a diagnostic reporter
molecule, in corn wet milling, in the fruit juice industry, in baking, in the textile industry
and in the detergent industry.

In accordance with yet a further aspect of the present invention, there is provided 
a process for utilizing such enzymes for hydrolyzing guar gum (a galactomannan 
polysaccharide) to remove non-reducing terminal mannose residues. Further 
polysaccharides such as galactomannan and the enzymes according to the invention that 
degrade them have a variety of applications. Guar gum is commonly used as a thickening 
agent in food and also is utilized in hydraulic fracturing in oil and gas recovery. 
Consequently, mannanases are industrially relevant for the degradation and modification 
of guar gums. Furthermore, a need exists for thermostable mannanases that are active in
the extreme conditions associated with drilling and well stimulation.

In accordance with yet a further aspect of the present invention, there are also
provided nucleic acid probes comprising nucleic acid molecules of sufficient length to
specifically hybridize to a nucleic acid sequence of the present invention.

In accordance with yet a further aspect of the present invention, there is provided 
a process for utilizing such enzymes, or polynucleotides encoding such enzymes, for in 
vitro purposes related to scientific research, for example, to generate probes for identifying 
similar sequences which might encode similar enzymes from other organisms by using 
certain regions, i.e., conserved sequence regions, of the nucleotide sequence.

These and other aspects of the present invention should be apparent to those skilled
in the art from the teachings herein. 

The polynucleotides of this invention were originally recovered from genomic gene 
libraries derived from the following organisms: 

M11TL is a new species of Desulfurococcus isolated from Diamond Pool in
Yellowstone National Park. The organism grows optimally at 85-88°C, pH 7.0 in a low salt
medium containing yeast extract, peptone, and gelatin as substrates with a N2/CO2 gas
phase.

OC1/4V is from the genus Thermotoga. The organism was isolated from
Yellowstone National Park. It grows optimally at 75°C in a low salt medium with cellulose
as a substrate and N2 in gas phase.

Pyrococcus furiosus VC1 (and 7EG1) is from the genus Pyrococcus. VC1 was
isolated from Vulcano, Italy. It grows optimally at 100°C in a high salt medium (marine)
containing elemental sulfur, yeast extract, peptone and starch as substrates and N2 in gas
phase.

Staphylothermus marinus F1 is from the genus Staphylothermus. F1 was isolated
from Vulcano, Italy. It grows optimally at 85°C, pH 6.5 in a high salt medium (marine)
containing elemental sulfur and yeast extract as substrates and N2 in gas phase.

Thermococcus 9N-2 is from the genus Thermococcus. 9N-2 was isolated from
diffuse vent fluid in the East Pacific Rise. It is a strict anaerobe that grows optimally at
87°C.

Thermotoga maritima MSB8 and MSB8 (Clone # 6GP2 and 6GB4) is from the
genus Thermotoga, and was isolated from Vulcano, Italy. MSB8 grows optimally at 85°C,
pH 6.5 in a high salt medium (marine) containing starch and yeast extract as substrates and
in gas phase.

Thermococcus alcaliphilus AEDII12RA is from the genus Thermococcus.
AEDII12RA grows optimally at 85°C, pH 9.5 in a high salt medium (marine) containing
polysulfides and yeast extract as substrates and N2 in gas phase.

Thermococcus chitonophagus GC74 is from the genus Thermococcus. GC74 grows
optimally at 85°C, pH 6.0 in a high salt medium (marine) containing chitin, meat extract,
elemental sulfur and yeast extract as substrates and in gas phase.

AEPII 1a grows optimally at 85°C at pH 6.5 in marine medium under anaerobic
conditions. It has many substrates.

Bankia gouldi is from the genus Bankia.

Accordingly, the polynucleotides and enzymes encoded thereby are identified by
the organism from which they were isolated, and are sometimes hereinafter referred to as
"M11TL" (Figure 1 and SEQ ID NOS:1 and 15), "OC1/4V-33B/G" (Figure 2 and SEQ ID
NOS:2 and 16), "F1-12G" (Figure 3 and SEQ ID NOS:3 and 17), "9N2-31B/G" (Figure 4
and SEQ ID NOS:4 and 18), "MSB8" (Figure 5 and SEQ ID NOS:5 and 19),
"AEDII12RA-18B/G" (Figure 6 and SEQ ID NOS:6 and 20), "GC74-22G" (Figure 7 and
SEQ ID NOS:7 and 21), "VC1-7G1" (Figure 8 and SEQ ID NOS:8 and 22), "37GP1"
(Figure 9 and SEQ ID NOS:9 and 23), "6GC2" (Figure 10 and SEQ ID NOS:10 and 24),
"6GP2" (Figure 11 and SEQ ID NOS:11 and 25), "AEPII 1a" (Figure 12 and SEQ ID
NOS:12 and 26), "OC1/4V" (Figure 13 and SEQ ID NOS:13 and 27), "6GP3" (Figure 14
and SEQ ID NOS:28), "MSB8-6GP2" (Figure 15 and SEQ ID NOS:57 and 61),
"MSB8-6GB4" (Figure 16 and SEQ ID NOS:58 and 62), "VC1-7EG1" (Figure 17 and SEQ
ID NOS:59 and 63), and "37GP4" (Figure 18 and SEQ ID NOS:60 and 64).

The polynucleotides and polypeptides of the present invention show identity at the
nucleotide and protein level to known genes and proteins encoded thereby as shown in
Table 1.

Table 1

Clone | Gene/Protein with Closest Homology | Protein Identity | Nucleic Acid Identity
M11TL-29G | Sulfolobus solfataricus DSM 1616/PK β-galactosidase | 51% | 55%
OC1/4V-33B/G | Caldocellum saccharolyticum, β-glucosidase | 52% | 57%
Staphylothermus marinus F1-12G | Bacillus polymyxa, β-galactosidase | 36% | 48%
Thermococcus 9N2-31B/G | Sulfolobus solfataricus ATCC 49255/MT4, β-galactosidase | 51% | 50%
Thermotoga maritima MSB8-6G | Clostridium thermocellum bglB | 45% | 53%
Thermococcus AEDII12RA-18B/G | Bacillus polymyxa, β-galactosidase | 34% | 48%
Thermococcus chitonophagus GC74-22G | Sulfolobus solfataricus ATCC 49255/MT4, β-galactosidase | 46% | 54%
Pyrococcus furiosus VC1-7G1 | Sulfolobus solfataricus/MT-4 β-galactosidase | 46.4% | 52.5%
Thermotoga maritima α-galactosidase | Pediococcus pentosaceus α-galactosidase | (not legible) | (not legible)
Thermotoga maritima β-mannanase (6GP2) | Aspergillus aculeatus mannanase | 56% | 37%
AEPII 1a β-mannosidase (63GB1) | Sulfolobus solfataricus β-galactosidase | 78% | 56%
OC1/4V endoglucanase (33GP1) | Clostridium thermocellum endo-1,4-β-endoglucanase | 65% | 43%
Thermotoga maritima pullulanase (6GP3) | Caldocellum saccharolyticum α-glucanohydrolase | 72% | 53%
Bankia gouldi mix endoglucanase (37GP1) | None available | |


The polynucleotides and enzymes of the present invention show homology to each
other as shown in Table 2.

Table 2

Clone | Gene/Protein with Closest Homology | Protein Identity | Nucleic Acid Identity
Staphylothermus marinus F1-12G | Thermococcus AEDII12RA-18B/G, β-galactosidase, glucosidase | 55% | 57%
Thermococcus 9N2-31B/G | Thermococcus chitonophagus GC74-22G glucosidase | 74% | 66%
Pyrococcus furiosus VC1-7G1 | Pyrococcus furiosus VC1-7B/G β-galactosidase | 46.4% | 54%


All the clones identified in Tables 1 and 2 encode polypeptides which have
α-glycosidase or β-glycosidase activity.

This invention, in addition to the isolated nucleic acid molecules encoding the
enzymes of the present invention, also provides substantially similar sequences. Isolated
nucleic acid sequences are substantially similar if: (i) they are capable of hybridizing, under
conditions hereinafter described, to the polynucleotides of SEQ ID NOS: 1-14 and 57-60;
or (ii) they encode DNA sequences which are degenerate to the polynucleotides of SEQ ID
NOS: 1-14 and 57-60. Degenerate DNA sequences encode the amino acid sequences of
SEQ ID NOS: 15-28 and 61-64, but have variations in the nucleotide coding sequences. As
used herein, substantially similar refers to sequences having similar identity to the
sequences of the instant invention. The nucleotide sequences that are substantially the same
can be identified by hybridization or by sequence comparison. Enzyme sequences that are
substantially the same can be identified by one or more of the following: proteolytic
digestion, gel electrophoresis and/or microsequencing.
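
The notion of degenerate sequences can be illustrated with a minimal Python sketch. The codon table below is a deliberately partial excerpt of the standard genetic code, and the two short sequences are hypothetical examples chosen for illustration; they are not taken from SEQ ID NOS: 1-14 or 57-60.

```python
# Partial excerpt of the standard genetic code: only the codons used below.
# The sequences are hypothetical illustrations, not sequences of the invention.
CODON_TABLE = {
    "ATG": "M",
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",
    "AAA": "K", "AAG": "K",
    "GGT": "G", "GGC": "G", "GGA": "G", "GGG": "G",
    "TAA": "*", "TAG": "*", "TGA": "*",
}


def translate(dna):
    """Translate a coding sequence codon by codon ('*' marks a stop codon)."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def are_degenerate(seq_a, seq_b):
    """True if the nucleotide sequences differ but encode the same amino acids."""
    return seq_a.upper() != seq_b.upper() and translate(seq_a) == translate(seq_b)


reference = "ATGGCTAAAGGTTAA"
variant = "ATGGCCAAGGGATAG"  # differs at the third position of four codons
print(translate(reference), translate(variant), are_degenerate(reference, variant))
```

A full implementation would need the complete 64-codon table; the excerpt is kept short so that the principle stays visible.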

One means for isolating the nucleic acid molecules encoding the enzymes of the
present invention is to probe a gene library with a natural or artificially designed probe
using art recognized procedures (see, for example: Current Protocols in Molecular Biology,
Ausubel F.M. et al. (Eds.) Green Publishing Company Assoc. and John Wiley Interscience,
New York, 1989, 1992). It will be appreciated by one skilled in the art that the
polynucleotides of SEQ ID NOS: 1-14 and 57-60 or fragments thereof (comprising at least
12 contiguous nucleotides) are particularly useful probes. Other particularly useful probes
for this purpose are fragments hybridizable to the sequences of SEQ ID NOS: 1-14 and
57-60 (i.e., comprising at least 12 contiguous nucleotides).

With respect to nucleic acid sequences which hybridize to specific nucleic acid
sequences disclosed herein, hybridization may be carried out under conditions of reduced
stringency, medium stringency or even stringent conditions. As an example of
oligonucleotide hybridization, a polymer membrane containing immobilized denatured
nucleic acids is first prehybridized for 30 minutes at 45°C in a solution consisting of 0.9 M
NaCl, 50 mM NaH2PO4, pH 7.0, 5.0 mM Na2EDTA, 0.5% SDS, 10X Denhardt's, and 0.5
mg/ml polyriboadenylic acid. Approximately 2 x 10^7 cpm (specific activity 4-9 x 10^8
cpm/µg) of 32P end-labeled oligonucleotide probe are then added to the solution. After
12-16 hours of incubation, the membrane is washed for 30 minutes at room temperature in 1X
SET (150 mM NaCl, 20 mM Tris hydrochloride, pH 7.8, 1 mM Na2EDTA) containing 0.5%
SDS, followed by a 30 minute wash in fresh 1X SET at Tm - 10°C for the oligonucleotide
probe. The membrane is then exposed to auto-radiographic film for detection of
hybridization signals.
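
The final wash temperature in the example above is set relative to the melting temperature (Tm) of the oligonucleotide probe, taken here as 10°C below Tm. The text does not say how Tm is estimated. The sketch below assumes the commonly used Wallace rule for short oligonucleotides, Tm = 2°C x (A+T) + 4°C x (G+C); that rule, the function names and the example probe are illustrative assumptions and not part of the disclosure.

```python
def wallace_tm(oligo):
    """Estimate Tm (deg C) of a short oligonucleotide with the Wallace rule.

    Assumption: the Wallace rule is used only as a rough estimate; the
    source text does not specify a Tm formula.
    """
    oligo = oligo.upper()
    at = oligo.count("A") + oligo.count("T")
    gc = oligo.count("G") + oligo.count("C")
    if not oligo or at + gc != len(oligo):
        raise ValueError("oligo must be a non-empty string of A, C, G, T")
    return 2.0 * at + 4.0 * gc


def stringent_wash_temp(oligo, offset=10.0):
    """Wash temperature taken as Tm minus an offset (10 deg C here)."""
    return wallace_tm(oligo) - offset


probe = "ATGGCTAAAGGTGCTAAG"  # hypothetical 18-mer, not a sequence of the invention
print(wallace_tm(probe), stringent_wash_temp(probe))
```

For probes much longer than about 20 nucleotides, or for non-standard salt conditions, a different Tm estimate would be more appropriate.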

Stringent conditions means hybridization will occur only if there is at least 90%
identity, preferably at least 95% identity and most preferably at least 97% identity between
the sequences. Further, it is understood that a fragment of a 100 bps sequence that is 95 bps
in length has 95% identity with the 100 bps sequence from which it is obtained. See J.
Sambrook et al., Molecular Cloning, A Laboratory Manual, 2d Ed., Cold Spring Harbor
Laboratory (1989), which is hereby incorporated by reference in its entirety.

As used herein, a first DNA (RNA) sequence is at least 70% and preferably at least
80% identical to another DNA (RNA) sequence if there is at least 70% and preferably at
least 80% or 90% identity, respectively, between the bases of the first sequence and the
bases of the other sequence, when properly aligned with each other, for example when
aligned by BLASTN.

"Identity" as the term is used herein, refers to a polynucleotide sequence which 
comprises a percentage of the same bases as a reference polynucleotide (SEQ ID NOS: 1-14 
and 57-60). For example, a polynucleotide which is at least 90% identical to a reference 
polynucleotide, has polynucleotide bases which are identical in 90% of the bases which 
make up the reference polynucleotide and may have different bases in 10% of the bases 
which comprise that polynucleotide sequence. 
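
The definition above, together with the earlier 95 bps fragment example, can be read as a simple calculation: the number of aligned positions carrying the same base, divided by the number of bases in the reference. The following Python sketch assumes the two sequences have already been aligned (for example by BLASTN) and that gaps are written as "-"; the function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_query, aligned_reference):
    """Percent of reference bases matched by identical bases in the query.

    Both strings must come from the same pairwise alignment (equal length,
    '-' for gaps). The denominator is the ungapped reference length,
    following the definition of identity relative to the reference.
    """
    if len(aligned_query) != len(aligned_reference):
        raise ValueError("aligned sequences must have equal length")
    reference_length = sum(1 for base in aligned_reference if base != "-")
    if reference_length == 0:
        raise ValueError("reference contains no bases")
    matches = sum(
        1
        for q, r in zip(aligned_query.upper(), aligned_reference.upper())
        if q == r and r != "-"
    )
    return 100.0 * matches / reference_length


# A 95-base fragment of a 100-base reference, padded with gaps after alignment.
reference = "ACGT" * 25
fragment = reference[:95] + "-" * 5
print(percent_identity(fragment, reference))
```

Alignment tools may report identity over the aligned region only, which would give 100% for the same fragment; the sketch follows the reference-based definition used in the text.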

The present invention relates to polynucleotides which differ from the reference
polynucleotide such that the changes are silent changes, for example changes that do not
alter the amino acid sequence encoded by the polynucleotide. The present invention also relates
to nucleotide changes which result in amino acid substitutions, additions, deletions, fusions 
and truncations in the polypeptide encoded by the reference polynucleotide. In a preferred 
aspect of the invention these polypeptides retain the same biological action as the 
polypeptide encoded by the reference polynucleotide. 

It is also appreciated that such probes can be and are preferably labeled with an 
analytically detectable reagent to facilitate identification of the probe. Useful reagents 
include but are not limited to radioactivity, fluorescent dyes or enzymes capable of 
catalyzing the formation of a detectable product. The probes are thus useful to isolate 
complementary copies of DNA from other sources or to screen such sources for related 
sequences. 

The polynucleotides of this invention were recovered from genomic gene libraries 
from the organisms listed in Table 1. For example, gene libraries can be generated in the 
Lambda ZAP II cloning vector (Stratagene Cloning Systems). Mass excisions can be 
performed on these libraries to generate libraries in the pBluescript phagemid. Libraries 
are thus generated and excisions performed according to the protocols/methods hereinafter 
described. 

The excision libraries are introduced into the E. coli strain BW14893 F'kan1A.
Expression clones are then identified using a high temperature filter assay. Expression
clones encoding several glucanases and several other glycosidases are identified and
repurified. The polynucleotides, and enzymes encoded thereby, of the present invention,
yield the activities as described above.

The coding sequences for the enzymes of the present invention were identified by 
screening the genomic DNAs prepared for the clones having glucosidase or galactosidase 
activity. 

An example of such an assay is a high temperature filter assay wherein expression
clones were identified using buffer Z (see recipe below) containing 1 mg/ml of the
substrate 5-bromo-4-chloro-3-indolyl-β-D-glucopyranoside (XGLU) (Diagnostic Chemicals
Limited or Sigma) after introducing an excision library into the E. coli strain BW14893
F'kan1A. Expression clones encoding XGLUases were identified and repurified from
M11TL, OC1/4V, Pyrococcus furiosus VC1, Staphylothermus marinus F1, Thermococcus
9N-2, Thermotoga maritima MSB8, Thermococcus alcaliphilus AEDII12RA, and
Thermococcus chitonophagus GC74.

Z-buffer: (referenced in Miller, J.H. (1992) A Short Course in Bacterial Genetics,
p. 445.)

per liter:

Na2HPO4-7H2O         16.1 g
NaH2PO4-7H2O         5.5 g
KCl                  0.75 g
MgSO4-7H2O           0.246 g
β-mercaptoethanol    2.7 ml
Adjust pH to 7.0
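
Because the recipe is given per liter, amounts for smaller batches follow by linear scaling. The short Python sketch below is only a convenience illustration; the dictionary keys and the function name are hypothetical, and the quantities are copied from the recipe above.

```python
# Per-liter Z-buffer quantities as listed in the recipe above.
Z_BUFFER_PER_LITER = {
    "Na2HPO4-7H2O (g)": 16.1,
    "NaH2PO4-7H2O (g)": 5.5,
    "KCl (g)": 0.75,
    "MgSO4-7H2O (g)": 0.246,
    "beta-mercaptoethanol (ml)": 2.7,
}


def scale_recipe(volume_ml, per_liter=Z_BUFFER_PER_LITER):
    """Scale each per-liter quantity linearly to the requested volume in ml."""
    if volume_ml <= 0:
        raise ValueError("volume must be positive")
    return {name: amount * volume_ml / 1000.0 for name, amount in per_liter.items()}


for name, amount in scale_recipe(250).items():
    print(f"{name}: {amount:.4g}")
```

The pH adjustment to 7.0 still has to be made on the finished solution, whatever the batch size.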

High Temperature Filter Assay

(1) The F factor F'kan (from E. coli strain CSH118)(1) was introduced into the pho-pnh-
lac- strain BW14893(2). The filamentous phage library was plated on the resulting
strain, BW14893 F'kan. (Miller, J.H. (1992) A Short Course in Bacterial Genetics;
Lee, K.S., Metcalf, et al. (1992) Evidence for two phosphonate degradative
pathways in Enterobacter aerogenes, J. Bacteriol., 174:2501-2510.)

(2) After growth on 100 mm LB plates containing 100 µg/ml ampicillin, 80 µg/ml
methicillin and 1 mM IPTG, colony lifts were performed using Millipore HATF
membrane filters.

(3) The colonies transferred to the filters were lysed with chloroform vapor in 150 mm
glass petri dishes.

(4) The filters were transferred to 100 mm glass petri dishes containing a piece of
Whatman 3 MM filter paper saturated with buffer.

(a) When testing for galactosidase activity (XGALase), the 3 MM paper was
saturated with Z buffer containing 1 mg/ml XGAL (ChemBridge
Corporation). After the filter bearing lysed colonies was transferred to the
glass petri dish, the dish was placed in an oven at 80-85°C.

(b) When testing for glucosidase activity (XGLUase), the 3 MM paper was
saturated with Z buffer containing 1 mg/ml XGLU. After the filter bearing
lysed colonies was transferred to the glass petri dish, the dish was placed in
an oven at 80-85°C.

(5) 'Positives' were observed as blue spots on the filter membranes. The following
filter rescue technique was used to retrieve plasmid from a lysed positive colony. A
Pasteur pipette (or glass capillary tube) was used to core blue spots on the filter
membrane. The small filter disk was placed in an Eppendorf tube containing 20 µl
water. The Eppendorf tube was incubated at 75°C for 5 minutes, followed by
vortexing to elute plasmid DNA off the filter. This DNA was transformed into
electrocompetent E. coli DH10B cells for Thermotoga maritima MSB8-6G,
Staphylothermus marinus F1-12G, Thermococcus AEDII12RA-18B/G,
Thermococcus chitonophagus GC74-22G, M11TL and OC1/4V. Electrocompetent
BW14893 F'kan1A E. coli cells were used for Thermococcus 9N2-31B/G and
Pyrococcus furiosus VC1-7G1. The filter-lift assay was repeated on transformation
plates to identify 'positives'. The transformation plates were returned to a 37°C
incubator after the filter lift to regenerate colonies. 3 ml of LB liquid containing
100 µg/ml ampicillin was inoculated with repurified positives and incubated at 37°C
overnight. Plasmid DNA was isolated from these cultures and the plasmid insert
was sequenced. In some instances where the plates used for the initial colony lifts
contained non-confluent colonies, a specific colony corresponding to a blue spot on
the filter could be identified on a regenerated plate and repurified directly, instead
of using the filter rescue technique.

Another example of such an assay is a variation of the high temperature filter assay
wherein colony-laden filters are heat-killed at different temperatures (for example, 105°C
for 20 minutes) to monitor thermostability. The 3 MM paper is saturated with different
buffers (e.g., 100 mM NaCl, 5 mM MgCl2, 100 mM Tris-Cl (pH 9.5)) to determine enzyme
activity under different buffer conditions.

A β-glucosidase assay may also be employed, wherein GlcpβNp is used as an
artificial substrate (aryl-β-glucosidase). The increase in absorbance at 405 nm as a result
of p-nitrophenol (pNp) liberation is followed on a Hitachi U-1100 spectrophotometer
equipped with a thermostatted cuvette holder. The assays may be performed at 80°C or
90°C in closed 1-ml quartz cuvettes. A standard reaction mixture contains 150 mM
trisodium substrate, pH 5.0 (at 80°C), and 0.95 mM pNp derivative (ε pNp = 0.561 mM⁻¹ cm⁻¹).
The reaction mixture is allowed to reach the desired temperature, after which the
reaction is started by injecting an appropriate amount of enzyme (1.06 ml final volume).

1 U of β-glucosidase activity is defined as that amount required to catalyze the
formation of 1.0 µmol pNp/min. D-cellobiose may also be used as a substrate.
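
The unit definition above can be turned into a short calculation. The following is a minimal sketch, not part of the original protocol: it assumes a 1 cm light path (the cuvette path length is not stated in the text), uses the extinction coefficient and the 1.06 ml final volume given above, and the function name is hypothetical.

```python
def beta_glucosidase_units(delta_a405_per_min, epsilon_mM_cm=0.561,
                           path_cm=1.0, volume_ml=1.06):
    """Return activity in U (micromol pNp released per minute).

    epsilon_mM_cm and volume_ml are taken from the assay description;
    path_cm = 1.0 is an assumption, since the text gives no path length.
    """
    if epsilon_mM_cm <= 0 or path_cm <= 0 or volume_ml <= 0:
        raise ValueError("epsilon, path length and volume must be positive")
    # Beer-Lambert law: concentration change (mM/min) = dA/min / (epsilon * path)
    rate_mM_per_min = delta_a405_per_min / (epsilon_mM_cm * path_cm)
    # mM multiplied by ml gives micromol, so this is micromol/min, i.e. U
    return rate_mM_per_min * volume_ml


if __name__ == "__main__":
    # Hypothetical reading: absorbance rises by 0.28 per minute
    print(round(beta_glucosidase_units(0.28), 3))
```

Dividing the result by the amount of protein injected would give a specific activity; that step is omitted because the text does not specify how protein is quantified.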

An ONPG assay for β-galactosidase activity is described by Miller, J.H. (1992) A
Short Course in Bacterial Genetics and Miller, J.H. (1992) Experiments in Molecular
Genetics, the contents of which are hereby incorporated by reference in their entirety.

A quantitative fluorometric assay for β-galactosidase specific activity is described
by Youngman, P. (1987) Plasmid Vectors for Recovering and Exploiting Tn917
Transpositions in Bacillus and other Gram-Positive Bacteria. In Plasmids: A Practical
Approach (ed. K. Hardy) pp. 79-103. IRL Press, Oxford. A description of the procedure can
be found in Miller (1992) p. 75-77, the contents of which are incorporated by reference
herein in their entirety.

The polynucleotides of the present invention may be in the form of DNA which
DNA includes cDNA, genomic DNA, and synthetic DNA. The DNA may be double- 
stranded or single-stranded, and if single stranded may be the coding strand or non-coding 
(anti-sense) strand. The coding sequences which encode the mature enzymes may be
identical to the coding sequences shown in Figures 1-18 (SEQ ID NOS: 1-14 and 57-60) or
may be a different coding sequence which coding sequence, as a result of the redundancy 
or degeneracy of the genetic code, encodes the same mature enzymes as the DNA of 
Figures 1-18 (SEQ ID NOS: 1-14 and 57-60). 

The polynucleotide which encodes for the mature enzyme of Figures 1-18 (SEQ ID 
NOS: 15-28 and 61-64) may include, but is not limited to: only the coding sequence for the 
mature enzyme; the coding sequence for the mature enzyme and additional coding sequence 
such as a leader sequence or a proprotein sequence; the coding sequence for the mature 
enzyme (and optionally additional coding sequence) and non-coding sequence, such as
introns or non-coding sequence 5' and/or 3' of the coding sequence for the mature enzyme. 

Thus, the term "polynucleotide encoding an enzyme (protein)" encompasses a 
polynucleotide which includes only coding sequence for the enzyme as well as a 
polynucleotide which includes additional coding and/or non-coding sequence. 

The present invention further relates to variants of the hereinabove described 
polynucleotides which encode for fragments, analogs and derivatives of the enzymes having 
the deduced amino acid sequences of Figures 1-18 (SEQ ID NOS: 15-28 and 61-64). The 
variant of the polynucleotide may be a naturally occurring allelic variant of the 
polynucleotide or a non-naturally occurring variant of the polynucleotide. 

Thus, the present invention includes polynucleotides encoding the same mature 
enzymes as shown in Figures 1-18 (SEQ ID NOS: 15-28 and 61-64) as well as variants of 
such polynucleotides which variants encode for a fragment, derivative or analog of the 
enzymes of Figures 1-18 (SEQ ID NOS: 15-28 and 61-64). Such nucleotide variants 
include deletion variants, substitution variants and addition or insertion variants. 

As hereinabove indicated, the polynucleotides may have a coding sequence which 
is a naturally occurring allelic variant of the coding sequences shown in Figures 1-18 (SEQ
ID NOS: 1-14 and 57-60). As known in the art, an allelic variant is an alternate form of a
polynucleotide sequence which may have a substitution, deletion or addition of one or more 
nucleotides, which does not substantially alter the function of the encoded enzyme. 

Fragments of the full length gene of the present invention may be used as a 
hybridization probe for a cDNA or a genomic library to isolate the full length DNA and to
isolate other DNAs which have a high sequence similarity to the gene or similar biological 
activity. Probes of this type preferably have at least 10, preferably at least 15, and even 
more preferably at least 30 bases and may contain, for example, at least 50 or more bases. 
The probe may also be used to identify a DNA clone corresponding to a full length 
transcript and a genomic clone or clones that contain the complete gene including 
regulatory and promoter regions, exons, and introns. An example of a screen comprises
isolating the coding region of the gene by using the known DNA sequence to synthesize an 
oligonucleotide probe. Labeled oligonucleotides having a sequence complementary to that 
of the gene of the present invention are used to screen a library of genomic DNA to 
determine which members of the library the probe hybridizes to. 
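
As an illustration of how a probe complementary to a known sequence might be derived, the sketch below computes the reverse complement of a chosen window. It is not taken from the original text: the function names are hypothetical, the default length of 30 bases follows the preferred probe length mentioned above, and the example target is a short stretch used only for demonstration.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    unexpected = set(seq) - set("ACGTacgt")
    if unexpected:
        raise ValueError(f"non-DNA characters: {sorted(unexpected)}")
    return seq.translate(_COMPLEMENT)[::-1]


def probe_for(target, start, length=30):
    """Return an oligonucleotide complementary to target[start:start+length]."""
    if length < 10:
        raise ValueError("probes of fewer than 10 bases are not contemplated")
    segment = target[start:start + length]
    if len(segment) < length:
        raise ValueError("window runs past the end of the target")
    return reverse_complement(segment)


if __name__ == "__main__":
    # Demonstration only: a 10-base probe against a short example sequence
    print(probe_for("ATGGTGAATGCTATGATTGTC", 0, length=10))
```

Labeling and hybridization conditions are outside the scope of this sketch.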

The present invention further relates to polynucleotides which hybridize to the 
hereinabove-described sequences if there is at least 70%, preferably at least 90%, and more 
preferably at least 95% identity between the sequences. The present invention particularly 
relates to polynucleotides which hybridize under stringent conditions to the hereinabove- 
described polynucleotides. As herein used, the term "stringent conditions" means 
hybridization will occur only if there is at least 95% and preferably at least 97% identity 
between the sequences. The polynucleotides which hybridize to the hereinabove described
polynucleotides in a preferred embodiment encode enzymes which retain
substantially the same biological function or activity as the mature enzyme encoded by the 
DNA of Figures 1-18 (SEQ ID NOS: 1-14 and 57-60). 
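
The identity thresholds recited above can be illustrated with a simple calculation. The sketch below is an assumption-laden example rather than the method used in the original work: it expects two sequences that have already been aligned to equal length (gaps written as "-"), counts identical non-gap positions, and divides by the alignment length, which is only one of several conventions for percent identity. The function names and tier labels are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over a pre-computed pairwise alignment.

    Both inputs must have the same length; '-' marks a gap. Identity is
    matches / alignment length (one common convention among several).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("empty alignment")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def identity_tier(pct):
    """Map a percent identity onto the thresholds named in the text."""
    for threshold in (97, 95, 90, 70):
        if pct >= threshold:
            return f"at least {threshold}%"
    return "below 70%"


if __name__ == "__main__":
    pct = percent_identity("ACGTACGTAC", "ACGTACGTAA")
    print(pct, identity_tier(pct))
```

Producing the alignment itself (for example with a dynamic-programming aligner) is a separate step that this sketch does not attempt.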

Alternatively, the polynucleotide may have at least 15 bases, preferably at least 30 
bases, and more preferably at least 50 bases which hybridize to any part of a polynucleotide 
of the present invention and which has an identity thereto, as hereinabove described, and 
which may or may not retain activity. For example, such polynucleotides may be employed
as probes for the polynucleotides of SEQ ID NOS: 1-14 and 57-60, for example, for
recovery of the polynucleotide or as a diagnostic probe or as a PCR primer. 

Thus, the present invention is directed to polynucleotides having at least a 70% 
identity, preferably at least 90% identity and more preferably at least a 95% identity to a 
polynucleotide which encodes the enzymes of SEQ ID NOS: 15-28 and 61-64 as well as 
fragments thereof which fragments have at least 15 bases, preferably at least 30 bases and
most preferably at least 50 bases, which fragments are at least 90% identical, preferably at 
least 95% identical and most preferably at least 97% identical under stringent conditions 
to any portion of a polynucleotide of the present invention. 

The present invention further relates to enzymes which have the deduced amino acid
sequences of Figures 1-18 (SEQ ID NOS: 15-28 and 61-64) as well as fragments, analogs 
and derivatives of such enzyme. 

The terms "fragment," "derivative" and "analog" when referring to the enzymes of 
Figures 1-18 (SEQ ID NOS: 15-28 and 61-64) means enzymes which retain essentially the 
same biological function or activity as such enzymes. Thus, an analog includes a proprotein 
which can be activated by cleavage of the proprotein portion to produce an active mature 
enzyme. 

The enzymes of the present invention may be a recombinant enzyme, a natural 
enzyme or a synthetic enzyme, preferably a recombinant enzyme. 

The fragment, derivative or analog of the enzymes of Figures 1-18 (SEQ ID NOS:
15-28 and 61-64) may be (i) one in which one or more of the amino acid residues are 
substituted with a conserved or non-conserved amino acid residue (preferably a conserved 
amino acid residue) and such substituted amino acid residue may or may not be one 
encoded by the genetic code, or (ii) one in which one or more of the amino acid residues 
includes a substituent group, or (iii) one in which the mature enzyme is fused with another 
compound, such as a compound to increase the half-life of the enzyme (for example, 
polyethylene glycol), or (iv) one in which the additional amino acids are fused to the mature
enzyme, such as a leader or secretory sequence or a sequence which is employed for
purification of the mature enzyme or a proprotein sequence. Such fragments, derivatives
and analogs are deemed to be within the scope of those skilled in the art from the teachings
herein. 

The enzymes and polynucleotides of the present invention are preferably provided 
in an isolated form, and preferably are purified to homogeneity. 

The term "isolated" means that the material is removed from its original 
environment (e.g., the natural environment if it is naturally occurring). For example, a 
naturally-occurring polynucleotide or enzyme present in a living animal is not isolated, but 
the same polynucleotide or enzyme, separated from some or all of the coexisting materials 
in the natural system, is isolated. Such polynucleotides could be part of a vector and/or 
such polynucleotides or enzymes could be part of a composition, and still be isolated in that 
such vector or composition is not part of its natural environment. 

The enzymes of the present invention include the enzymes of SEQ ID NOS: 15-28 
and 61-64 (in particular the mature enzyme) as well as enzymes which have at least 70% 
similarity (preferably at least 70% identity) to the enzymes of SEQ ID NOS: 15-28 and
61-64 and more preferably at least 90% similarity (more preferably at least 90% identity) to
the enzymes of SEQ ID NOS: 15-28 and 61-64 and still more preferably at least 95% 
similarity (still more preferably at least 95% identity) to the enzymes of SEQ ID NOS: 15- 
28 and 61-64 and also include portions of such enzymes with such portion of the enzyme 
generally containing at least 30 amino acids and more preferably at least 50 amino acids. 

As known in the art, "similarity" between two enzymes is determined by comparing
the amino acid sequence and its conserved amino acid substitutes of one enzyme to the 
sequence of a second enzyme. 

A variant, i.e. a "fragment", "analog" or "derivative" polypeptide, and reference 
polypeptide may differ in amino acid sequence by one or more substitutions, additions, 
deletions, fusions and truncations, which may be present in any combination. 

Among preferred variants are those that vary from a reference by conservative 
amino acid substitutions. Such substitutions are those that substitute a given amino acid in 
a polypeptide by another amino acid of like characteristics. Typically seen as conservative 
substitutions are the replacements, one for another, among the aliphatic amino acids Ala,
Val, Leu and Ile; interchange of the hydroxyl residues Ser and Thr; exchange of the acidic
residues Asp and Glu; substitution between the amide residues Asn and Gln; exchange of
the basic residues Lys and Arg; and replacements among the aromatic residues Phe and Tyr.
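
The groupings listed above lend themselves to a simple lookup. The following sketch encodes exactly those six groups and reports whether a given substitution falls within one of them; the dictionary and function names are hypothetical, and the example is illustrative rather than a description of how variants were actually evaluated.

```python
# Groups as listed in the text, using three-letter residue codes
CONSERVATIVE_GROUPS = {
    "aliphatic": {"Ala", "Val", "Leu", "Ile"},
    "hydroxyl": {"Ser", "Thr"},
    "acidic": {"Asp", "Glu"},
    "amide": {"Asn", "Gln"},
    "basic": {"Lys", "Arg"},
    "aromatic": {"Phe", "Tyr"},
}


def conservative_group(original, replacement):
    """Return the group name if the substitution is conservative, else None.

    An unchanged residue is not a substitution, so it also returns None.
    """
    if original == replacement:
        return None
    for name, members in CONSERVATIVE_GROUPS.items():
        if original in members and replacement in members:
            return name
    return None


if __name__ == "__main__":
    print(conservative_group("Leu", "Ile"))
    print(conservative_group("Asp", "Lys"))
```

Residues that appear in none of the listed groups (for example Gly or Pro) are always reported as non-conservative by this lookup.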

Most highly preferred are variants which retain the same biological function and 
activity as the reference polypeptide from which they vary.

Fragments or portions of the enzymes of the present invention may be employed for 
producing the corresponding full-length enzyme by peptide synthesis; therefore, the
fragments may be employed as intermediates for producing the full-length enzymes.
Fragments or portions of the polynucleotides of the present invention may be used to 
synthesize full-length polynucleotides of the present invention. 

The present invention also relates to vectors which include polynucleotides of the 
present invention, host cells which are genetically engineered with vectors of the invention 
and the production of enzymes of the invention by recombinant techniques. 

Host cells are genetically engineered (transduced or transformed or transfected) with 
the vectors of this invention which may be, for example, a cloning vector or an expression 
vector. The vector may be, for example, in the form of a plasmid, a viral particle, a phage, 
etc. The engineered host cells can be cultured in conventional nutrient media modified as 
appropriate for activating promoters, selecting transformants or amplifying the genes of the 
present invention. The culture conditions, such as temperature, pH and the like, are those 
previously used with the host cell selected for expression, and will be apparent to the 
ordinarily skilled artisan. 

The polynucleotides of the present invention may be employed for producing 
enzymes by recombinant techniques. Thus, for example, the polynucleotide may be 
included in any one of a variety of expression vectors for expressing an enzyme. Such 
vectors include chromosomal, nonchromosomal and synthetic DNA sequences, e.g., 
derivatives of SV40; bacterial plasmids; phage DNA; baculovirus; yeast plasmids; vectors
derived from combinations of plasmids and phage DNA, viral DNA such as vaccinia,
adenovirus, fowl pox virus, and pseudorabies. However, any other vector may be used as
long as it is replicable and viable in the host.

The appropriate DNA sequence may be inserted into the vector by a variety of
procedures. In general, the DNA sequence is inserted into an appropriate restriction 
endonuclease site(s) by procedures known in the art. Such procedures and others are
deemed to be within the scope of those skilled in the art. 

The DNA sequence in the expression vector is operatively linked to an appropriate 
expression control sequence(s) (promoter) to direct mRNA synthesis. As representative 
examples of such promoters, there may be mentioned: LTR or SV40 promoter, the E. coli
lac or trp, the phage lambda promoter and other promoters known to control expression
of genes in prokaryotic or eukaryotic cells or their viruses. The expression vector also 
contains a ribosome binding site for translation initiation and a transcription terminator. 
The vector may also include appropriate sequences for amplifying expression. 

In addition, the expression vectors preferably contain one or more selectable marker 
genes to provide a phenotypic trait for selection of transformed host cells such as 
dihydrofolate reductase or neomycin resistance for eukaryotic cell culture, or such as 
tetracycline or ampicillin resistance in E. coli.

The vector containing the appropriate DNA sequence as hereinabove described, as
well as an appropriate promoter or control sequence, may be employed to transform an 
appropriate host to permit the host to express the protein. 

As representative examples of appropriate hosts, there may be mentioned: bacterial 
cells, such as E. coli, Streptomyces, Bacillus subtilis; fungal cells, such as yeast; insect cells
such as Drosophila S2 and Spodoptera Sf9; animal cells such as CHO, COS or Bowes
melanoma; adenoviruses; plant cells, etc. The selection of an appropriate host is deemed
to be within the scope of those skilled in the art from the teachings herein. 

More particularly, the present invention also includes recombinant constructs
comprising one or more of the sequences as broadly described above. The constructs 
comprise a vector, such as a plasmid or viral vector, into which a sequence of the invention 
has been inserted, in a forward or reverse orientation. In a preferred aspect of this 
embodiment, the construct further comprises regulatory sequences, including, for example,
a promoter, operably linked to the sequence. Large numbers of suitable vectors and
promoters are known to those of skill in the art, and are commercially available. The
following vectors are provided by way of example: Bacterial: pQE70, pQE60, pQE-9
(Qiagen), pD10, psiX174, pBluescript II KS, pNH8A, pNH16a, pNH18A, pNH46A
(Stratagene); ptrc99a, pKK223-3, pKK233-3, pDR540, pRIT5 (Pharmacia); Eukaryotic:
pSV2CAT, pOG44, pXT1, pSG (Stratagene) pSVK3, pBPV, pMSG, pSVL (Pharmacia).
However, any other plasmid or vector may be used as long as they are replicable and viable 
in the host. 

Promoter regions can be selected from any desired gene using CAT 
(chloramphenicol transferase) vectors or other vectors with selectable markers. Two 
appropriate vectors are pKK232-8 and pCM7. Particular named bacterial promoters include 
lacI, lacZ, T3, T7, gpt, lambda PR, PL and trp. Eukaryotic promoters include CMV
immediate early, HSV thymidine kinase, early and late SV40, LTRs from retrovirus, and 
mouse metallothionein-I. Selection of the appropriate vector and promoter is well within 
the level of ordinary skill in the art. 

In a further embodiment, the present invention relates to host cells containing the
above-described constructs. The host cell can be a higher eukaryotic cell, such as a 
mammalian cell, or a lower eukaryotic cell, such as a yeast cell, or the host cell can be a 
prokaryotic cell, such as a bacterial cell. Introduction of the construct into the host cell can 
be effected by calcium phosphate transfection, DEAE-Dextran mediated transfection, or 
electroporation (Davis, L., Dibner, M., Battey, I., Basic Methods in Molecular Biology, 
(1986)). 

The constructs in host cells can be used in a conventional manner to produce the 
gene product encoded by the recombinant sequence. Alternatively, the enzymes of the 
invention can be synthetically produced by conventional peptide synthesizers. 

Mature proteins can be expressed in mammalian cells, yeast, bacteria, or other cells 
under the control of appropriate promoters. Cell-free translation systems can also be 
employed to produce such proteins using RNAs derived from the DNA constructs of the 
present invention. Appropriate cloning and expression vectors for use with prokaryotic and 
eukaryotic hosts are described by Sambrook, et al., Molecular Cloning: A Laboratory
Manual, Second Edition, Cold Spring Harbor, N.Y., (1989), the disclosure of which is
hereby incorporated by reference. 

Transcription of the DNA encoding the enzymes of the present invention by higher 
eukaryotes is increased by inserting an enhancer sequence into the vector. Enhancers are 
cis-acting elements of DNA, usually about from 10 to 300 bp that act on a promoter to 
increase its transcription. Examples include the SV40 enhancer on the late side of the 
replication origin bp 100 to 270, a cytomegalovirus early promoter enhancer, the polyoma 
enhancer on the late side of the replication origin, and adenovirus enhancers. 

Generally, recombinant expression vectors will include origins of replication and 
selectable markers permitting transformation of the host cell, e.g., the ampicillin resistance
gene of E. coli and S. cerevisiae TRP1 gene, and a promoter derived from a highly-
expressed gene to direct transcription of a downstream structural sequence. Such promoters
can be derived from operons encoding glycolytic enzymes such as 3-phosphoglycerate 
kinase (PGK), α-factor, acid phosphatase, or heat shock proteins, among others. The
heterologous structural sequence is assembled in appropriate phase with translation 
initiation and termination sequences, and preferably, a leader sequence capable of directing 
secretion of translated enzyme. Optionally, the heterologous sequence can encode a fusion 
enzyme including an N-terminal identification peptide imparting desired characteristics, 
e.g., stabilization or simplified purification of expressed recombinant product. 

Useful expression vectors for bacterial use are constructed by inserting a structural 
DNA sequence encoding a desired protein together with suitable translation initiation and 
termination signals in operable reading phase with a functional promoter. The vector will 
comprise one or more phenotypic selectable markers and an origin of replication to ensure 
maintenance of the vector and to, if desirable, provide amplification within the host. 
Suitable prokaryotic hosts for transformation include E. coli, Bacillus subtilis, Salmonella
typhimurium and various species within the genera Pseudomonas, Streptomyces, and
Staphylococcus, although others may also be employed as a matter of choice. 

As a representative but nonlimiting example, useful expression vectors for bacterial 
use can comprise a selectable marker and bacterial origin of replication derived from
commercially available plasmids comprising genetic elements of the well known cloning
vector pBR322 (ATCC 37017). Such commercial vectors include, for example, pKK223-3
(Pharmacia Fine Chemicals, Uppsala, Sweden) and GEM1 (Promega Biotec, Madison, WI,
USA). These pBR322 "backbone" sections are combined with an appropriate promoter and 
the structural sequence to be expressed. 

Following transformation of a suitable host strain and growth of the host strain to 
an appropriate cell density, the selected promoter is induced by appropriate means (e.g., 
temperature shift or chemical induction) and cells are cultured for an additional period. 

Cells are typically harvested by centrifugation, disrupted by physical or chemical
means, and the resulting crude extract retained for further purification. 

Microbial cells employed in expression of proteins can be disrupted by any 
convenient method, including freeze-thaw cycling, sonication, mechanical disruption, or 
use of cell lysing agents; such methods are well known to those skilled in the art.

Various mammalian cell culture systems can also be employed to express 
recombinant protein. Examples of mammalian expression systems include the COS-7 lines 
of monkey kidney fibroblasts, described by Gluzman, Cell, 23:175 (1981), and other cell 
lines capable of expressing a compatible vector, for example, the C127, 3T3, CHO, HeLa 
and BHK cell lines. Mammalian expression vectors will comprise an origin of replication, 
a suitable promoter and enhancer, and also any necessary ribosome binding sites, 
polyadenylation site, splice donor and acceptor sites, transcriptional termination sequences, 
and 5' flanking nontranscribed sequences. DNA sequences derived from the SV40 splice
and polyadenylation sites may be used to provide the required nontranscribed genetic 
elements. 

The enzyme can be recovered and purified from recombinant cell cultures by 
methods including ammonium sulfate or ethanol precipitation, acid extraction, anion or 
cation exchange chromatography, phosphocellulose chromatography, hydrophobic 
interaction chromatography, affinity chromatography, hydroxylapatite chromatography and 
lectin chromatography. Protein refolding steps can be used, as necessary, in completing
configuration of the mature protein. Finally, high performance liquid chromatography
(HPLC) can be employed for final purification steps. 

The enzymes of the present invention may be a naturally purified product, or a
product of chemical synthetic procedures, or produced by recombinant techniques from a 
prokaryotic or eukaryotic host (for example, by bacterial, yeast, higher plant, insect and 
mammalian cells in culture). Depending upon the host employed in a recombinant 
production procedure, the enzymes of the present invention may be glycosylated or may be 
non-glycosylated. Enzymes of the invention may or may not also include an initial 
methionine amino acid residue. 

β-galactosidase hydrolyzes lactose to galactose and glucose. Accordingly, the
OC1/4V, 9N2-31B/G, AEDII12RA-18B/G and F1-12G enzymes may be employed in the
food processing industry for the production of low lactose content milk and for the 
production of galactose or glucose from lactose contained in whey obtained in a large 
amount as a by-product in the production of cheese. Generally, it is desired that enzymes 
used in food processing, such as the aforementioned β-galactosidases, be stable at elevated
temperatures to help prevent microbial contamination. 

These enzymes may also be employed in the pharmaceutical industry. The enzymes 
are used to treat intolerance to lactose. In this case, a thermostable enzyme is desired, as 
well. Thermostable β-galactosidases also have uses in diagnostic applications, where they
are employed as reporter molecules. 

Glucosidases act on soluble cellooligosaccharides from the non-reducing end to give 
glucose as the sole product. Glucanases (endo- and exo-) act in the depolymerization of 
cellulose, generating more non-reducing ends (endo-glucanases, for instance, act on internal
linkages yielding cellobiose, glucose and cellooligosaccharides as products).
β-glucosidases are used in applications where glucose is the desired product. Accordingly,
M11TL, F1-12G, GC74-22G, MSB8-6G, OC1/4V, VC1-7G1, 9N2-31B/G and
AEDII12RA-18B/G may be employed in a wide variety of industrial applications, including
in corn wet milling for the separation of starch and gluten, in the fruit industry for
clarification and equipment maintenance, in baking for viscosity reduction, in the textile
industry for the processing of blue jeans, and in the detergent industry as an additive. For
these and other applications, thermostable enzymes are desirable. 

Antibodies generated against the enzymes corresponding to a sequence of the 
present invention can be obtained by direct injection of the enzymes into an animal or by 
administering the enzymes to an animal, preferably a nonhuman. The antibody so obtained
will then bind the enzymes themselves. In this manner, even a sequence encoding only a
fragment of the enzymes can be used to generate antibodies binding the whole native
enzymes. Such antibodies can then be used to isolate the enzyme from cells expressing that
enzyme. 

For preparation of monoclonal antibodies, any technique which provides antibodies 
produced by continuous cell line cultures can be used. Examples include the hybridoma 
technique (Kohler and Milstein, 1975, Nature, 256:495-497), the trioma technique, the 
human B-cell hybridoma technique (Kozbor et al., 1983, Immunology Today 4:72), and the 
EBV-hybridoma technique to produce human monoclonal antibodies (Cole, et al., 1985, in
Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). 

Techniques described for the production of single chain antibodies (U.S. Patent 
4,946,778) can be adapted to produce single chain antibodies to immunogenic enzyme 
products of this invention. Also, transgenic mice may be used to express humanized 
antibodies to immunogenic enzyme products of this invention. 

Antibodies generated against the enzyme of the present invention may be used in 
screening for similar enzymes from other organisms and samples. Such screening 
techniques are known in the art; for example, one such screening assay is described in
"Methods for Measuring Cellulase Activities", Methods in Enzymology, Vol 160, pp. 87-116,
which is hereby incorporated by reference in its entirety.

The present invention will be further described with reference to the following 
examples; however, it is to be understood that the present invention is not limited to such 
examples. All parts or amounts, unless otherwise specified, are by weight. 

In order to facilitate understanding of the following examples certain frequently
occurring methods and/or terms will be described.

"Plasmids" are designated by a lower case p preceded and/or followed by capital
letters and/or numbers. The starting plasmids herein are either commercially available, 
publicly available on an unrestricted basis, or can be constructed from available plasmids 
in accord with published procedures. In addition, equivalent plasmids to those described 
are known in the art and will be apparent to the ordinarily skilled artisan. 

"Digestion" of DNA refers to catalytic cleavage of the DNA with a restriction 
enzyme that acts only at certain sequences in the DNA. The various restriction enzymes 
used herein are commercially available and their reaction conditions, cofactors and other 
requirements were used as would be known to the ordinarily skilled artisan. For analytical 
purposes, typically 1 µg of plasmid or DNA fragment is used with about 2 units of enzyme
in about 20 µl of buffer solution. For the purpose of isolating DNA fragments for plasmid
construction, typically 5 to 50 µg of DNA are digested with 20 to 250 units of enzyme in
a larger volume. Appropriate buffers and substrate amounts for particular restriction
enzymes are specified by the manufacturer. Incubation times of about 1 hour at 37°C are
ordinarily used, but may vary in accordance with the supplier's instructions. After digestion
the reaction is electrophoresed directly on a polyacrylamide gel to isolate the desired
fragment. 

Size separation of the cleaved fragments is performed using 8 percent
polyacrylamide gel described by Goeddel, D. et al., Nucleic Acids Res., 8:4057 (1980).

"Oligonucleotides" refers to either a single stranded polydeoxynucleotide or two 
complementary polydeoxynucleotide strands which may be chemically synthesized. Such 
synthetic oligonucleotides have no 5' phosphate and thus will not ligate to another
oligonucleotide without adding a phosphate with an ATP in the presence of a kinase. A 
synthetic oligonucleotide will ligate to a fragment that has not been dephosphorylated. 

"Ligation" refers to the process of forming phosphodiester bonds between two 
double stranded nucleic acid fragments (Maniatis, T., et al., Id., p. 146). Unless otherwise 
provided, ligation may be accomplished using known buffers and conditions with 10 units 
of T4 DNA ligase ("ligase") per 0.5 µg of approximately equimolar amounts of the DNA
fragments to be ligated.

Unless otherwise stated, transformation was performed as described in the method
of Graham, F. and Van der Eb, A., Virology. 52:456-457 (1973). 

Example 1 

Bacterial Expression and Purification of Glycosidase Enzymes

DNA encoding the enzymes of the present invention, SEQ ID NOS: 1-14 and 57-60 
were initially amplified from a pBluescript vector containing the DNA by the PCR
technique using the primers noted herein. The amplified sequences were then inserted into
the respective pQE vector listed beneath the primer sequences, and the enzyme was
expressed according to the protocols set forth herein. The 5' and 3' primer sequences for 
the respective genes are as follows: 

Thermococcus AEDII12RA-18B/G

5' CCGAGAATTCATTAAAGAGGAGAAATTAACTATGGTGAATGCTATGATTGTC 3' (SEQ ID NO:29)
3' CGGAAGATCTTCATAGCTCCGGAAGCCCATA 5' (SEQ ID NO:30)

Vector: pQE12; and contains the following restriction enzyme sites 5' EcoRI and 3' Bgl II.
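
As an illustrative cross-check (not part of the original example), primer strings can be scanned for the recognition sequences of the restriction enzymes named in the vector notes. The sketch below uses the standard recognition sequences for EcoRI, BglII, KpnI, BamHI, HindIII and SphI, and the AEDII12RA-18B/G primer pair with apparent transcription errors corrected, which is an assumption about the intended sequences. The primers are searched exactly as printed, without regard to the 5'/3' labels; the function and constant names are hypothetical.

```python
# Standard recognition sequences for the enzymes named in the vector notes
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "BglII": "AGATCT",
    "KpnI": "GGTACC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "SphI": "GCATGC",
}

# Primer pair for AEDII12RA-18B/G, as read from the text (assumed corrected)
FORWARD = "CCGAGAATTCATTAAAGAGGAGAAATTAACTATGGTGAATGCTATGATTGTC"
REVERSE = "CGGAAGATCTTCATAGCTCCGGAAGCCCATA"


def find_sites(primer):
    """Return {enzyme: 0-based index of first match} for sites in the primer."""
    primer = primer.upper()
    found = {}
    for enzyme, site in RECOGNITION_SITES.items():
        position = primer.find(site)
        if position != -1:
            found[enzyme] = position
    return found


if __name__ == "__main__":
    print("forward:", find_sites(FORWARD))
    print("reverse:", find_sites(REVERSE))
```

A site found near the start of each primer, preceded by a few extra bases, is consistent with the cloning sites stated for the vector.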

OC1/4V-33B/G 

5' CCGAGAATTCATTAAAGAGGAGAAATTAACTATGATAAGAAGGTCCGATTTTCC 3'
(SEQ ID NO:31)

3' CGGAAGATCTTTAAGATTTTAGAAATTCCTT 5' (SEQ ID NO:32)

Vector: pQE12; and contains the following restriction enzyme sites 5' EcoRI and 3' Bgl
II. 

Thermococcus 9N2-31B/G

5' CCGAGAATTCATTAAAGAGGAGAAATTAACTATGCTACCAGAAGGCTTTCTC 3'
(SEQ ID NO:33)

3' CGGAGGTACCTCACCCAAGTCCGAACTTCTC 5' (SEQ ID NO:34) 

Vector: pQE30; and contains the following restriction enzyme sites 5' EcoRI and 3'
KpnI.

Staphylothermus marinus F1-12G

5' CCGAGAATTCATTAAAGAGGAGAAATTAACTATGATAAGGTTTCCTGATTAT 3'
(SEQ ID NO:35)

3' CGGAAGATCTTTATTCGAGGTTCTTTAATCC 5' (SEQ ID NO:36) 

Vector: pQE12; and contains the following restriction enzyme sites 5* EcoRI and 3' Bgl 
II. 

Thermococcus chitonophagus GC74 - 22G 

5' CCGAGAATTCATTCATTAAAGAGGAGAAATTAACTATGCTTCCAGGAGA.ACTTTCTC 3' 
(SEQ ID NO:37) 

3' CGGAGGATCCCTACCCCTCCTCTAAGATCTC 5' (SEQ ID NO:38) 

Vector: pQE12; and contains the following restriction enzyme sites 5' EcoRI and 3' 
BamHI. 

M11TL 

5' AATAATCTAGAGCATGCAATTCCCCAAAGACTTCATGATAG 3' (SEQ ID NO:39) 
3' AATAAAAGCTTACTGGATCAGTGTAAGATGCT 5' (SEQ ID NO:40) 

Vector: pQE70; and contains the following restriction enzyme sites 5' SphI and 3' Hind 
III. 

Thermotoga maritima MSB8-6G 

5' CCGACAATTGATTAAAGAGGAGAAATTAACTATGGAAAGGATCGATGAAATT 3' (SEQ ID NO:41) 
3' CGGAGGTACCTCATGGTTTGAATCTCTTCTC 5' (SEQ ID NO:42) 

Vector: pQE12; and contains the following restriction enzyme sites 5' EcoRI and 3' 
KpnI. 

Pyrococcus furiosus VC1 - 7G1 

5' CCGACAATTGATTAAAGAGGAGAAATTAACTATGTTCCCTGAAAAGTTCCTT 3' (SEQ ID NO:43) 
3' CGGAGGTACCTCATCCCCTCAGCAATTCCTC 5' (SEQ ID NO:44) 

Vector: pQE12; and contains the following restriction enzyme sites 5' EcoRI and 3' Kpn 
I. 
Bankia gouldi endoglucanase (37GP1) 

5' AATAAGGATCCOnTAGCGACGCTCGC 3' (SEQ ID NO:45) 

3' AATAAAAGCTTCCGGGrrGTACAGCGGT,-^TAGGC 5' (SEQ ID NO:46) 

Vector: pQE52; and contains the following restriction enzyme sites 5' 
Hind III. 



Thermotoga maritima α-galactosidase (6GC2) 

5' TTTATTGAATTCATTAAAGAGGAGAAATTAACTATGATCTGTGTGGAAATATTCGGAAAG 3' 
(SEQ ID NO:47) 

3' TCTATAAAGCTTCATTCTCTCTCACCCTCTTCGTAGAAG 5' (SEQ ID NO:48) 

Vector: pQET; and contains the following restriction enzyme sites 5' EcoRI 
III. 



Thermotoga maritima β-mannanase (6GP2) 

5' TTTATTCAATTGATTAAAGAGGAGAAATTAACTATGGGGATTGGTGGCGACGAC 3' 
(SEQ ID NO:49) 

3' TTTATTAAGCTTATCTTTTCATATTCACATACCTCC 5' (SEQ ID NO:50) 

Vector: pQEt; and contains the following restriction enzyme sites 5' H 
EcoRI. 



AEPII1a β-mannanase (63GB1) 

5' TTTATTGAATTCATTAAAGAGGAGAAATTAACTATGCTACCAGAAGAGTTCCTATGGGGC 3' 
(SEQ ID NO:51) 

3' TTTATTAAGCTTCTCATCAACGGCTATGGTCTTCATTTC 5' (SEQ ID NO:52) 

Vector: pQEt; and contains the following restriction enzyme sites 5' Hind III a 
EcoRI. 



OC1/4V endoglucanase (33GP1) 

5' AAAAAACAATTGAATTCATTAAAGAGGAGAAATTAACTATGGTAGAAAGACACTTCAGATATGTTCTT 3' 
(SEQ ID NO:53) 

3' TmrCGGATCCAA-rrcrrCA-nTACTC-nTGCCTG 5' (SEQ ID NO:54) 

Vector: pQEt; and contains the following restriction enzyme sites 5' BamHI and 3' 
EcoRI. 

Thermotoga maritima pullulanase (6GP3) 

5' TTTTGGAATTCATTAAAGAGGAGAAATTAACTATGGAACTGATCATAGAAGGTTAC 3' 
(SEQ ID NO:55) 

3' ATAAGAAGCTTTTCACTGTCTGTACAGAACGTACGC 5' (SEQ ID NO:56) 

Vector: pQEt; and contains the following restriction enzyme sites 5' EcoRI and 3' Hind 
III. 

The restriction enzyme sites indicated correspond to the restriction enzyme sites on 
the bacterial expression vector indicated for the respective gene (Qiagen, Inc., Chatsworth, 
CA). The pQE vector encodes antibiotic resistance (Amp^r), a bacterial origin of replication 
(ori), an IPTG-regulatable promoter operator (P/O), a ribosome binding site (RBS), a 6-His 
tag and restriction enzyme sites. 
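
To illustrate how the listed restriction sites relate to the primers, the following minimal sketch scans a primer for the recognition sequences of the enzymes named in Example 1. The recognition sequences are the standard ones for these enzymes; the function name `find_sites` is hypothetical, and each primer is searched exactly as printed, without regard to its 5'/3' labels. Only primers that are free of extraction ambiguity are used.

```python
# Standard recognition sequences for the enzymes named in Example 1.
SITES = {
    "EcoRI": "GAATTC",
    "BglII": "AGATCT",
    "KpnI": "GGTACC",
    "BamHI": "GGATCC",
    "SphI": "GCATGC",
    "HindIII": "AAGCTT",
}


def find_sites(primer):
    """Return {enzyme: 0-based position of first match} for sites present in the primer."""
    primer = primer.upper()
    return {name: primer.find(site) for name, site in SITES.items() if site in primer}


# Primers copied as printed in Example 1.
seq_id_39 = "AATAATCTAGAGCATGCAATTCCCCAAAGACTTCATGATAG"  # M11TL 5' primer
seq_id_34 = "CGGAGGTACCTCACCCAAGTCCGAACTTCTC"            # Thermococcus 9N2 3' primer
seq_id_56 = "ATAAGAAGCTTTTCACTGTCTGTACAGAACGTACGC"       # pullulanase 3' primer

for label, seq in [("SEQ ID NO:39", seq_id_39),
                   ("SEQ ID NO:34", seq_id_34),
                   ("SEQ ID NO:56", seq_id_56)]:
    print(label, find_sites(seq))
```

In this sketch the M11TL 5' primer carries an SphI site, the Thermococcus 9N2 3' primer a KpnI site, and the pullulanase 3' primer a Hind III site, in agreement with the vector lines listed for those genes.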

The pQE vector was digested with the restriction enzymes indicated. The amplified 
sequences were ligated into the respective pQE vector and inserted in frame with the 
sequence encoding for the RBS. The ligation mixture was then used to transform the E. coli 
strain M15/pREP4 (Qiagen, Inc.) by electroporation. M15/pREP4 contains multiple copies 
of the plasmid pREP4, which expresses the lacI repressor and also confers kanamycin 
resistance (Kan^r). Transformants were identified by their ability to grow on LB plates, and 
ampicillin/kanamycin resistant colonies were selected. Plasmid DNA was isolated and 
confirmed by restriction analysis. Clones containing the desired constructs were grown 
overnight (O/N) in liquid culture in LB media supplemented with both Amp (100 µg/ml) 
and Kan (25 µg/ml). The O/N culture was used to inoculate a large culture at a ratio of 
1:100 to 1:250. The cells were grown to an optical density at 600 nm (O.D.600) of between 0.4 and 
0.6. IPTG ("Isopropyl-β-D-thiogalactopyranoside") was then added to a final 
concentration of 1 mM. IPTG induces by inactivating the lacI repressor, clearing the P/O 
and leading to increased gene expression. Cells were grown an extra 3 to 4 hours. Cells were 
then harvested by centrifugation. 
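
The inoculation ratio and the IPTG addition described above are simple dilution calculations. The sketch below works them through for an assumed 1000 ml culture and an assumed 1 M IPTG stock; neither value is stated in the text, and the function names are hypothetical. The small volume change caused by the additions is ignored.

```python
def inoculum_volume_ml(culture_ml, ratio):
    """Volume of overnight culture needed to inoculate culture_ml at 1:ratio."""
    return culture_ml / ratio


def stock_volume_ml(final_mM, stock_mM, culture_ml):
    """Volume of stock needed for a final concentration (C1 * V1 = C2 * V2)."""
    return final_mM * culture_ml / stock_mM


culture_ml = 1000.0     # assumed culture size (not given in the text)
iptg_stock_mM = 1000.0  # assumed 1 M IPTG stock (not given in the text)

low = inoculum_volume_ml(culture_ml, 250)   # 1:250 inoculation
high = inoculum_volume_ml(culture_ml, 100)  # 1:100 inoculation
iptg_ml = stock_volume_ml(1.0, iptg_stock_mM, culture_ml)  # 1 mM final

print(f"O/N inoculum: {low:.1f} to {high:.1f} ml")
print(f"IPTG stock to add: {iptg_ml:.2f} ml")
```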
The primer sequences set out above may also be employed to isolate the target gene 
from the deposited material by hybridization techniques described above. 

Example 2 

Isolation of a Selected Clone from the Deposited Genomic Clones 

A clone is isolated directly by screening the deposited material using the 
oligonucleotide primers set forth in Example 1 for the particular gene desired to be 
isolated. The specific oligonucleotides are synthesized using an Applied Biosystems 
DNA synthesizer. The oligonucleotides are labeled with 32P-γ-ATP using T4 
polynucleotide kinase and purified according to a standard protocol (Maniatis et al., 
Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Press, Cold Spring, NY, 
1982). The deposited clones in the pBluescript vectors may be employed to transform 
bacterial hosts which are then plated on 1.5% agar plates to the density of 20,000-50,000 
pfu/150 mm plate. These plates are screened using Nylon membranes according 
to the standard screening protocol (Stratagene, 1993). Specifically, the Nylon 
membrane with denatured and fixed DNA is prehybridized in 6 x SSC, 20 mM 
NaH2PO4, 0.4% SDS, 5 x Denhardt's, 500 µg/ml denatured, sonicated salmon sperm 
DNA; and 6 x SSC, 0.1% SDS. After one hour of prehybridization, the membrane is 
hybridized with hybridization buffer (6 x SSC, 20 mM NaH2PO4, 0.4% SDS, 500 µg/ml 
denatured, sonicated salmon sperm DNA) with 1 x 10^6 cpm/ml 32P-probe overnight at 
42 °C. The membrane is washed at 45-50 °C with washing buffer (6 x SSC, 0.1% SDS) 
for 20-30 minutes, dried and exposed to Kodak X-ray film overnight. Positive clones are 
isolated and purified by secondary and tertiary screening. The purified clone is 
sequenced to verify its identity to the primer sequence. 
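
As a worked example of preparing the hybridization buffer, the sketch below computes the volume of each stock needed for 100 ml of buffer from the final concentrations given above. The stock concentrations (20 x SSC, 1 M NaH2PO4, 10% SDS, 10 mg/ml salmon sperm DNA) and the 100 ml batch size are assumptions made for illustration and are not taken from the text.

```python
def stock_volumes(total_ml, recipe):
    """Return ml of each stock for total_ml of buffer.

    recipe maps component -> (final concentration, stock concentration),
    both in the same unit, so that C1 * V1 = C2 * V2 applies directly.
    """
    volumes = {name: final * total_ml / stock
               for name, (final, stock) in recipe.items()}
    volumes["water"] = total_ml - sum(volumes.values())
    return volumes


# Final concentrations from the text; stock concentrations are assumed.
hybridization_buffer = {
    "SSC (x)": (6, 20),
    "NaH2PO4 (mM)": (20, 1000),
    "SDS (%)": (0.4, 10),
    "salmon sperm DNA (ug/ml)": (500, 10000),
}

vols = stock_volumes(100, hybridization_buffer)
for name, ml in vols.items():
    print(f"{name}: {ml:.1f} ml")
```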

Once the clone is isolated, the two oligonucleotide primers corresponding to the 
gene of interest are used to amplify the gene from the deposited material. A polymerase 
chain reaction is carried out in 25 µl of reaction mixture with 0.5 µg of the DNA of the 
gene of interest. The reaction mixture is 1.5-5 mM MgCl2, 0.01% (w/v) gelatin, 20 µM 
each of dATP, dCTP, dGTP, dTTP, 25 pmol of each primer and 0.25 Unit of Taq 
polymerase. Thirty-five cycles of PCR (denaturation at 94 °C for 1 min; annealing at 
55 °C for 1 min; elongation at 72 °C for 1 min) are performed with the Perkin-Elmer 
Cetus automated thermal cycler. The amplified product is analyzed by agarose gel 
electrophoresis and the DNA band with expected molecular weight is excised and 
purified. The PCR product is verified to be the gene of interest by subcloning and 
sequencing the DNA product. The ends of the newly purified genes are nucleotide 
sequenced to identify full length sequences. Complete sequencing of full length genes is 
then performed by Exonuclease III digestion or primer walking. 
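
The reaction quantities above can be cross-checked with a few lines of arithmetic. The sketch below converts the stated concentrations and amounts for a 25 µl reaction and estimates the hold time of the cycling program; the function names are hypothetical, and ramp times between temperatures are not included because the text does not give them.

```python
def amount_pmol(conc_uM, volume_ul):
    """Amount in pmol, since 1 uM x 1 ul = 1 pmol."""
    return conc_uM * volume_ul


def concentration_uM(pmol, volume_ul):
    """Concentration in uM for an amount in pmol dissolved in volume_ul."""
    return pmol / volume_ul


def cycling_minutes(cycles, step_minutes):
    """Total hold time, ignoring ramping between temperatures."""
    return cycles * sum(step_minutes)


reaction_ul = 25
dntp_each_pmol = amount_pmol(20, reaction_ul)       # 20 uM of each dNTP
primer_uM = concentration_uM(25, reaction_ul)       # 25 pmol of each primer
total_min = cycling_minutes(35, [1, 1, 1])          # 94 / 55 / 72 degrees C, 1 min each

print(dntp_each_pmol, "pmol of each dNTP per reaction")
print(primer_uM, "uM of each primer")
print(total_min, "minutes of hold time over 35 cycles")
```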

Example 3 
Screening for Galactosidase Activity 

α-Galactosidase protein activity may be screened for as 
follows: 

Substrate plates were provided by a standard plating procedure. Dilute XL1-Blue 
MRF' E. coli host (Stratagene Cloning Systems, La Jolla, CA) to O.D.600 = 1.0 
with NZY media. In 15 ml tubes, inoculate 200 µl diluted host cells with phage. Mix 
gently and incubate tubes at 37 °C for 15 min. Add approximately 3.5 ml LB top 
agarose (0.7%) containing 1 mM IPTG to each tube and pour onto an NZY plate surface. 
Allow to cool and incubate at 37 °C overnight. The assay plates are prepared with the 
substrate p-nitrophenyl α-galactoside (Sigma) (200 mg/100 ml) in 100 mM NaCl, 100 
mM potassium phosphate, 1% (w/v) agarose. The plaques are overlaid with 
nitrocellulose and incubated at 4 °C for 30 minutes, whereupon the nitrocellulose is 
removed and overlaid onto the substrate plates. The substrate plates are then incubated 
at 70 °C for 20 minutes. 
Example 4 

Screening of Clones for Mannanase Activity 

A solid phase screening assay was utilized as a primary screening method to test 
clones for β-mannanase activity. 

A culture solution of the Y1090 E. coli host strain (Stratagene Cloning Systems, 
La Jolla, CA) was diluted to O.D.600 = 1.0 with NZY media. The amplified library from 
the Thermotoga maritima lambda gt11 library was diluted in SM (phage dilution buffer): 5 
x 10^7 pfu/µl diluted 1:1000 then 1:100 to 5 x 10^2 pfu/µl. Then 8 µl of phage dilution 
(5 x 10^2 pfu/µl) was plated in 200 µl host cells. They were then incubated in 15 ml 
tubes at 37 °C for 15 minutes. 
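
The two-step phage dilution can be followed with a short calculation. The sketch below assumes a starting titer of 5 x 10^7 pfu/µl, a value adopted here for illustration, and applies the 1:1000 and 1:100 dilutions before computing the number of plaque forming units in the 8 µl that is plated. The function name is hypothetical.

```python
def serial_dilution(titer, factors):
    """Apply successive 1:factor dilutions to a starting titer (pfu/ul)."""
    for factor in factors:
        titer /= factor
    return titer


start_titer = 5e7  # pfu/ul, assumed starting titer of the amplified library
final_titer = serial_dilution(start_titer, [1000, 100])
pfu_plated = final_titer * 8  # 8 ul of the final dilution per tube

print(f"final titer: {final_titer:g} pfu/ul")
print(f"plated per tube: {pfu_plated:g} pfu")
```

Under this assumption the combined dilution is 10^5-fold and each tube receives on the order of a few thousand plaque forming units.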

Approximately 4 ml of molten LB top agarose (0.7%) at approximately 52 °C 
was added to each tube and the mixture was poured onto the surface of LB agar plates. 
The agar plates were then incubated at 37 °C for five hours. The plates were replicated 
and induced with 10 mM IPTG-soaked Duralon-UV™ nylon membranes (Stratagene 
Cloning Systems, La Jolla, CA) overnight. The nylon membranes and plates were 
marked with a needle to keep their orientation and the nylon membranes were then 
removed and stored at 4 °C. 

An Azo-galactomannan overlay was applied to the LB plates containing the 
lambda plaques. The overlay contains 1% agarose, 50 mM potassium-phosphate buffer 
pH 7, 0.4% Azocarob-galactomannan (Megazyme, Australia). The plates were 
incubated at 72 °C. The Azocarob-galactomannan treated plates were observed after 4 
hours then returned to incubation overnight. Putative positives were identified by 
clearing zones on the Azocarob-galactomannan plates. Two positive clones were 
observed. 

The nylon membranes referred to above, which correspond to the positive clones, 
were retrieved, oriented over the plate and the portions matching the locations of the 
clearing zones for positive clones were cut out. Phage was eluted from the membrane 
cut-out portions by soaking the individual portions in 500 µl SM (phage dilution buffer) 
and 25 µl CHCl3. 
Example 5 

Screening of Clones for Mannosidase Activity 

A solid phase screening assay was utilized as a primary screening method to test 
clones for β-mannosidase activity. 

A culture solution of the Y1090 E. coli host strain (Stratagene Cloning Systems, 
La Jolla, CA) was diluted to O.D.600 = 1.0 with NZY media. The amplified library from 
the AEPII 1a lambda gt11 library was diluted in SM (phage dilution buffer): 5 x 10^7 pfu/µl 
diluted 1:1000 then 1:100 to 5 x 10^2 pfu/µl. Then 8 µl of phage dilution 
(5 x 10^2 pfu/µl) was plated in 200 µl host cells. They were then incubated in 15 ml 
tubes at 37 °C for 15 minutes. 

Approximately 4 ml of molten, LB top agarose (0.7%) at approximately 52 °C 
was added to each tube and the mixture was poured onto the surface of LB agar plates. 
The agar plates were then incubated at 37 °C for five hours. The plates were replicated 
and induced with 10 mM IPTG-soaked Duralon-UV™ nylon membranes (Stratagene 
Cloning Systems, La Jolla, CA) overnight. The nylon membranes and plates were 
marked with a needle to keep their orientation and the nylon membranes were then 
removed and stored at 4 °C. 

A p-nitrophenyl-β-D-mannopyranoside overlay was applied to the LB plates 
containing the lambda plaques. The overlay contains 1% agarose, 50 mM potassium-phosphate 
buffer pH 7, 0.4% p-nitrophenyl-β-D-mannopyranoside (Megazyme, 
Australia). The plates were incubated at 72 °C. The p-nitrophenyl-β-D-mannopyranoside 
treated plates were observed after 4 hours then returned to incubation 
overnight. Putative positives were identified by clearing zones on the 
p-nitrophenyl-β-D-mannopyranoside plates. Two positive clones were observed. 

The nylon membranes referred to above, which correspond to the positive clones, 
were retrieved, oriented over the plate and the portions matching the locations of the 
clearing zones for positive clones were cut out. Phage was eluted from the membrane 
cut-out portions by soaking the individual portions in 500 µl SM (phage dilution buffer) 
and 25 µl CHCl3. 
Example 6 
Screening for Pullulanase Activity 

Screening procedures for pullulanase protein activity may be assayed for as 
follows: 

Substrate plates were provided by a standard plating procedure. Host cells are 
diluted to O.D.600 = 1.0 with NZY or appropriate media. In 15 ml tubes, inoculate 200 
µl diluted host cells with phage. Mix gently and incubate tubes at 37 °C for 15 min. 
Approximately 3.5 ml LB top agarose (0.7%) is added to each tube and the mixture 
is plated, allowed to cool, and incubated at 37 °C for about 28 hours. Overlays of 4.5 
ml of the following substrate are poured: 



100 ml total volume 

0.5 g   Red Pullulan (Megazyme, Australia) 
1.0 g   Agarose 
5 ml    Buffer (Tris-HCl pH 7.2 @ 75 °C) 
2 ml    5 M NaCl 
5 ml    CaCl2 (100 mM) 
85 ml   dH2O 

Plates are cooled at room temperature, and then incubated at 75 °C for 2 hours. 
Positives are observed as showing substrate degradation. 
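
The final concentrations in the overlay follow directly from the recipe above. The sketch below computes them for the 100 ml batch and estimates how many 4.5 ml overlays one batch provides; the function names are hypothetical, and the buffer is left out because its stock concentration is not given in the text.

```python
def final_concentration(stock_conc, stock_ml, total_ml):
    """Final concentration after diluting stock_ml of stock into total_ml."""
    return stock_conc * stock_ml / total_ml


def percent_w_v(grams, total_ml):
    """Weight/volume percentage (g per 100 ml)."""
    return 100.0 * grams / total_ml


total_ml = 100
nacl_mM = final_concentration(5000, 2, total_ml)   # 2 ml of 5 M NaCl
cacl2_mM = final_concentration(100, 5, total_ml)   # 5 ml of 100 mM CaCl2
pullulan_pct = percent_w_v(0.5, total_ml)          # 0.5 g Red Pullulan
agarose_pct = percent_w_v(1.0, total_ml)           # 1.0 g agarose
overlays_per_batch = int(total_ml // 4.5)          # 4.5 ml per overlay

print(f"NaCl {nacl_mM:g} mM, CaCl2 {cacl2_mM:g} mM")
print(f"Red Pullulan {pullulan_pct:g}% (w/v), agarose {agarose_pct:g}% (w/v)")
print(f"{overlays_per_batch} overlays per batch")
```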

Example 7 
Screening for Endoglucanase Activity 

Screening procedures for endoglucanase protein activity may be assayed for as 
follows: 

1. The gene library is plated onto 6 LB/GelRite/0.1% CMC/NZY agar plates 
(~4,800 plaque forming units/plate) in E. coli host with LB agarose as top agarose. The 
plates are incubated at 37°C overnight. 

2. Plates are chilled at 4°C for one hour. 

3. The plates are overlaid with Duralon membranes (Stratagene) at room 
temperature for one hour and the membranes are oriented and lifted off the plates and 
stored at 4 °C. 

4. The top agarose layer is removed and plates are incubated at 37°C for ~3 hours. 

5. The plate surface is rinsed with NaCl. 

6. The plate is stained with 0.1% Congo Red for 15 minutes. 

7. The plate is destained with 1 M NaCl. 

8. The putative positives identified on plate are isolated from the Duralon 
membrane (positives are identified by clearing zones around clones). The phage is 
eluted from the membrane by incubating in 500 µl SM + 25 µl CHCl3. 

9. Insert DNA is subcloned into any appropriate cloning vector and 
subclones are reassayed for CMCase activity using the following protocol: 

i) Spin 1 ml overnight miniprep of clone at maximum speed for 3 minutes. 

ii) Decant the supernatant and use it to fill "wells" that have been 
made in an LB/GelRite/0.1% CMC plate. 

iii) Incubate at 37 °C for 2 hours. 

iv) Stain with 0.1% Congo Red for 15 minutes. 

v) Destain with 1 M NaCl for 15 minutes. 

vi) Identify positives by clearing zone around clone. 

Numerous modifications and variations of the present invention are possible in 
light of the above teachings and, therefore, within the scope of the appended claims, the 
invention may be practiced otherwise than as particularly described. 
WHAT IS CLAIMED IS: 

1. An isolated polynucleotide selected from the group consisting of: 

(a) SEQ ID NOS: 1-14 and 57-60; 

(b) SEQ ID NOS: 1-14 and 57-60, wherein T can also be U; 

(c) polynucleotide sequences complementary to SEQ ID NOS: 1-14 and 57-60; 

(d) polynucleotide sequences which encode an amino acid sequence as set 
forth in SEQ ID NOS:15-28, and 61-64; and 

(e) fragments of (a), (b), (c) or (d) that are at least 15 consecutive bases in 
length and that will selectively hybridize to DNA which encodes a 
polypeptide of SEQ ID NOS: 15-28, and 61-64. 
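
The sequence relationships named in parts (b), (c) and (e) of claim 1 can be illustrated with a minimal sketch: replacing T with U, forming the complementary strand, and enumerating contiguous 15-base fragments. The function names are hypothetical, a primer from Example 1 stands in for a full-length sequence, and the sketch does not model whether a fragment hybridizes selectively.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def to_rna(dna):
    """Same sequence with every T written as U."""
    return dna.upper().replace("T", "U")


def reverse_complement(dna):
    """Complementary strand, written 5' to 3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def fragments(seq, length=15):
    """All contiguous windows of exactly `length` bases."""
    return [seq[i:i + length] for i in range(len(seq) - length + 1)]


# Stand-in sequence: the primer printed as SEQ ID NO:34 in Example 1.
example = "CGGAGGTACCTCACCCAAGTCCGAACTTCTC"

print(to_rna(example))
print(reverse_complement(example))
print(len(fragments(example)), "fragments of 15 bases")
```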



2. A vector comprising a polynucleotide of claim 1. 

3. A host cell containing the vector of claim 2. 

4. The method of claim 3, wherein the host cell is a eukaryotic cell. 

5. The method of claim 3, wherein the host cell is a prokaryotic cell. 

6. A method for producing a polypeptide comprising: 

(a) culturing the host cells of claim 3; 

(b) expressing from the host cell of claim 3 a polypeptide encoded by said 
polynucleotide; and 

(c) isolating the polypeptide. 
7. An enzyme selected from the group consisting of: 

(a) an enzyme comprising an amino acid sequence set forth in SEQ ID NOS: 
15-28 or 61-64; and 

(b) an enzyme which comprises at least 30 consecutive amino acid residue as 
an enzyme of (a). 

8. An enzyme of which at least a portion is coded for by a polynucleotide of 
claim 1, and which is selected from the group consisting of: 

(a) an enzyme comprising an amino acid sequence which is at least 70% 
identical to an amino acid sequence selected from the group of amino 
acid sequences set forth in SEQ ID NOS:15-28 or 61-64; and 

(b) an enzyme which comprises at least 30 amino acid residues to the 
enzyme of (a). 
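
Claim 8 refers to amino acid sequences that are at least 70% identical. This section does not say how identity is calculated, so the sketch below uses one simple convention as an assumption: identical positions divided by alignment length, for two sequences that have already been aligned to equal length. The function name is hypothetical, and the two ten-residue examples are the N-terminal residues as best read from the Thermococcus 9N2 and Thermococcus chitonophagus figures, used for illustration only.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identical positions between two pre-aligned, equal-length sequences.

    Gap characters ("-") never count as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b and a != "-" for a, b in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)


# Illustrative N-terminal residues (one-letter code), as read from the figures.
nine_n2 = "MLPEGFLWGV"
chitonophagus = "MLPENFLWGV"

identity = percent_identity(nine_n2, chitonophagus)
print(f"{identity:.0f}% identical; meets 70% threshold: {identity >= 70}")
```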

9. A method for generating glucose from soluble cell oligosaccharides comprising 
contacting a sample containing oligosaccharides with an effective amount of an 
enzyme selected from the group consisting of an enzyme having the amino acid 
sequence set forth in SEQ ID NOS: 15-28, 61-63 and 64 such that glucose is 
produced. 

10. The method of claim 9, wherein the sample is selected from the group consisting 
of dairy products, fruit juices, detergents, textiles, guar gum, animal feed, plant 
biomass and waste products. 

11. The method of claim 9, wherein the oligosaccharide is selected from the group 
consisting of maltose, cellobiose, lactose, sucrose, raffinose, stachyose, 
verbascose, cellulose, starch, amylose, glycogen, disaccharides, polysaccharides 
and pullulan. 
M11TL GLYCOSIDASE - 23G 
COMPLETE GENE SEQUENCE - 9/95 

t rri; AAA rir (■<•(■ aaa kac rn a'h. ata ta* ita -ii-r tca t t-c, rrr caa rrr *;aa <.i t 

I H.M (.ys rl,.. (»,.. I.y..; AS)! V\tf M.*| ll«< illy Tyt 'It-t ;>t Pro I'ln* t;l» I'lit- Cl.t A 1 .i 

r.) iKrr A'fT i-fi- cr:/; re*- iIac cat crt; aat A»rr uat tih; in-A -nit; urt; ^at t:Ar fci: ^:a(; 

.M Wly Crfi <.;ly •*.i:r r.»u Asp I'm* AtiM S.'i Asp Ttp Trp V/i I Trp Vrt / IliN A.sp Chi 10 

r.:i A^c ACA u:a r.r,-r r/-.A fn-A <.nr Ai;r (ur *:at tit rtr i;a(; aac ct'.c cca ct.-r ni^; aat iK(t 
III Afiu rh; Al.i Airt njy Umi Vol Sc-r Cly A/ip iMio Pro Clu Asn C3y t»ro Gly Tyi r> P Asn t.J 

lei 'ITA AAC CAA AAT CAC CAC CAC CTf: crr UAC AAC CTC CCC CTT AAC ACT ATT At'.A ITVA i;t:r: 2 'JO 
61 i.r'u Asn On A5n A5p His A.sp AM i:lu Uys Leu Cly Val Asn Thr Ua Arn Vol Gly 80 

241 err CAC TCC act ACC ATT TTT CCA AAC CCA ACT TTC AAT GTT AAA CTC CCT CTA GAG ACA 300 
81 Vdl Glu Trp Ser Arg He Phc Pro Ly* Pro Thr Ph« Asn Val Lys Val Pro Val Clu kzg 100 






301 CAT GAG AAC GGC ACC ATT CTT CAC CTA CAT CTC GAT CAT AAA CCC CTT CAA ACA CTT CAT 

101 ASP Clu Asn Cly Ser Xie Val His Val Asp Val Asp Asp Lys Ala Val clu Ars L«u Asp 

361 CAA TTA CCC AAC AAC GAG CCC CTA AAC CAT TAC CTA CAA ATG TAT AAA CAC TGG CTT GAA 

121 Clu Lau Ala A»n Lys Glu Ala Val Aan His Tyr Val Clu Met Tyr Ly« A*p Trp val Clu 

421 ACA CCT ACA AAA CTT ATA CTC AAT TTA TAC CAT TCC CCC CTC CCT CTC TCC CTT CAC AAC 

141 Arg Cly Arg Lys Lau He Ltu Aan Uu Tyr His Trp Pro Leu Pro Leu Trp Leu His Asn 

481 CCA ATC ATC CTG ACA ACA ATC CCC CCC CAC ACA CCC CCC TCA CCC TGG CTT AAC CAC CAC 540 

161 Pro lie Hec Val Arg Arg Met Cly Pro Asp Arg Ala Pro S« Cly Trp Leu Asn Ciu Clu 130 

541 TCC CTG CTC GAG TTT GCC AAA TAC CCC CCA TAC ATT CCT TCC AAA ATG GGC GAG CTA CCT 

lei ser Val VaX Glu Pha Ala Ly. Tyr Ala Aia Tyr He Ala Trp Lys Met Cly Glu Leu Pro 

err ATG TCG ACC ACC ATG AAC GAA CCC AAC CTC CTT TAT CAC CAA CCA TAC ATG TTC GTT 






201 Val Nec Trp Ser Thr Met Asn Glu Pro Asn Val Val Tyr Clu Gin Gly Tyr Mac Phe Val 

661 AAA CCC GCT TTC CCA CCC OGC TAC TTC ACT TTC CAA CCT CCT GAT AAG CCC ACC AGA AAT 
221 Ly. Gly Cly Phe Pro Pro Cly Tyr Uu Ser Leu Clu Ala Ala Asp Lys Ala Arg Arg Asn 

721 ATC ATC CAG CCT CAT GCA CCC CCC TAT GAC AAT ATT AAA CCC TTC ACT AAC AAA CCT CTT 
241 Met He Cln Ala Hi. Ala Arg Ala Tyr Asp As* lie Lys Arg Phe Ser Lys Ly. Pro Val 260 

CCA CTA ATA TAC CCT TTC CAA TCC TTC CAA CTA TTA CAC CCT CCA GCA CAA CTA TTT CAT 






281 Lys Ph. Ly. ser Ser Ly. Leu Tyr Tyr Phe Thr Asp Ue Val Sar Lys Cly ser Ser He 

... CTT CAA TAC ACC ACA CAT CTT CCC AAT ACG CTA GAC TCC TTG GGC GTT AAC TAC 

ne Asn val Glu Tyr Arg Arg Asp Leu Ala Asa Arg Leu Asp Trp Leu Cly Val Asn Tyr 

961 TAT ACC CCT TTA CTC TAC AAA ATC CTC CAT GAC AAA CCT ATA ATC CTC CAC CCC TAT CCA 

321 Tyr ser Arg Leu Val Tyr Lys He Val Asp Asp Lys Pro 11. Ha Leu His Gly Tyr Gly 

1021 TTC err TCT ACA CCT CCC CCC ATC ACC CCC CCT CAA AAT CCT TCT ACC CAT 

"l Phe Leu Cys Thr Pro Gly Cly He Ser Pro Ala Glu Asn Pro Cy. Ser Asp Phe Cly Trp 

10«1 CAG CTC TAT CCT CAA CCA CTC TAC CT^ CTT CTA AAA GAA CTT TAC AAC CCA TAC CCC CTA 1140 

361 Clu Val Tyc Pro Glu Cly Leu Tyi Leu Leu Leu Ly. Clu Leu Tyr Asn Arg Tyr Gly Val 380 

U41 CAC rrc atc ctc ac-c cac aac cct ctt tca cac acc: ac;.; uat ccc ttc aca ccc cca tac i^oo 

1«1 ASP Leu He Val Thr Clu Asn Cly Val Ser Asp Ser Aro As,. Ala Leu Arg Pro Ala Tyr 40O 

UOl CTC CTC TCG CAT >Tn TAC A<:C CTA TtT. AAA GCC GCf AAC CAC e.CC ATT CCC CtC AAA CC^" ll^*>'> 

401 UH. val r...r ihs vni Ty. ser Val Trp Lys Alo AIh am. Clu cly lie rro VaJ Uys Oy -i.!* 

I .LI TAC on- . A.- Ti;.; mu- nr. aca ^;Ar aat rAC cm: riu: r.rr ta.: i;f:c riv act: cm: aaa rn , » m) 

V^.l Ty< U.U iiu; Tm^ I Thf Anp Am. Tyr T,„ Al.. vAn rjy H». Arg VAn Uyy, rl. 



I AW ATA AC* ACO Ttf BAT TTT <TA AAA CAT TTT ATC TTC CCA ACT. .»T A.r, .a'A <X.-A TAC 
. Hc. II. Are *t9 S«r A.p t'h- .-..^ i.y. A«B Ph. lU Cly Thr AU Tlir AlA Al. Tyr 

M .-AO m CM OCT CCA CCA AAC CAA .;AT CCC ACA COC CCA TCA ATT Tc:C .:aT CTC I-TT TCA 
:i Cln tit 61u Cly Al. Al. A»n clu A.i. Cly Ar9 CJy l>ro S«f U. Trp Aip «a1 P.Ii« Ser 

IJl CAC ACC CCT 6CC AAA ACC CTC. AAI.' IXT OAC ACA CCA SAC CTT CCC TCT rjVC CAT TAT CAC 
11 ll>> Thr Pro Sly L,» Thr Lc. Asn Cly A,p Thr Oly A.p v»l Al. Cys A.p K>s Tyr H.s 

IHl CfA TAC AAB CAA CAT ATC CAO CTC ATC AAA CAA ATA CGO TTA CAC CCT TAC ACC TTC TCT 
61 Atfl Tyr ty. Clu A.p IJ. Cln L«u M.c Ly. Ciu li. flly Uu A.p Al* Tyr Ar, Ph. S«r 

341 ATC TCC TCC CCC ACA ATT ATC CCA OAT CCC AAO AAC ATC AAC CAA AAC CCT CTC CAT TTC 
«l n. S.r Trp Pro Arg II. M.t Pro A.p Cly l.y. A.n lU A.n Cln ly Cly v»l A.p Ph. 

JOl TAC AAC AfiA etc CTT CAT CAC CTT TTC AAO AAT OAT ATC ATA CCA TTC CTA ACA CTC TAT 
III Jin vH A.P Clu L.U L.a Ly, A.n A« 11. U. Pro Ph. v.l Thr ..u Tyr 

m CACT<WCACTTACCCTACtXACrrTATGAAAAAOCT«y.T«CTTAACCCACATATACCC « 
l" Hi. Trp A.P UU Pto Tyr AU Uu Tyr Clu ly. Oly Oly Trp L.u A.n Pro A.p li. AU U 

411 CTC TAT TTC ACA OCA TAC CCA ACO TTT ATC TTC AAC SAA CTC CCT CAT CCT CTC AAA CAT 
m ^ in AT, AU Tyr AU Thr Ph. h.t Ph. A.n Clu Uu Cly A.p Ar, v.l ty. H.. 

1 TTC ATA Acc rrr ccr cat tat ttc m: ttt a-.A At 'a u.t Af -A tca tcg cac cm: A Vf cAt; fio 

1 Met rip Arg Ph« Pro Asp Tyr Pht Leu Phf r.\y Thr AU Thr Scr Stfr Mik f;ln !)<• <nii ;;f) 

61 ccr AAT AAC ATA TTT MT CAT TCC TCC CAC TGC CAC ACT AAA CCC ACC ATT AAC rrrc. ACA l .'O 

21 Cly A«n Asn He Phe A*n Asp Trp Trp Clu Trp Clu Thr Lys Gly Arg lie Lys V* 1 Arq 



301 COG ATA CAA CCT CTA ATC ACT CTT CAC CAC TTC ACA AAC CCC CAA TOG TIT ATC AAA ATT 

lOi Cly Il« Clu Pro V»l II* Thr L«u Hig Hi« Ph« Thr Ajn Pro Cln Trp Ph* «et Lys He 

361 GGT CCA TCG ACT ACC CAA CAC AAC ATA AAA TAT TTT ATA AAA TAT CTA CAA CTT ATA CCT 

121 Gly Cly Trp Thr Atb Clu Clu Asn Jl« Ly» Tyr Phe 11« Lys Tyr Val Clu Ltu Ue AU 

421 TCC CAC ATA AAA CAC CTC AAA ATA TOO ATC ACT ATT AAT CAA CCA ATA ATA TAT CTT TTA 

141 Ser Clu lie Lys Asp V»I Lys II* Trp lie Thr lis Asn Clu Pro He He TfT Val Leu 

481 CAA CCA TAT ATT TCC CCC CAA TCC CCA CCT CCA ATT AAA AAT TTA AAA ATA CCT CAT CA.\ 

161 Gin Cly Tyr II* Ser Cly Clu Trp Pro Pro Cly He Lys Asn Leu Lys 21* Ala Asp Cln 

541 CTA ACT AAC AAT CTT TTA AAA OCA CAT AAT CAA CCC TAT AAT ATA CTT CAT AAA CAC CCT 

lai Val Thr Lys A*n L«u Leu Lys Al* Hi* Asn Clu Als Tyr Asn II* L*u Kxs Lys His Cly 

601 ATT CTA CCC ATA CCT AAA AAC ATC ATA CCA TTT AAA CCA CCA TCT AAT ACA CCA AAA CAC 

201 II* Val Cly II* AU Lys Asa H«c 11* Al* ?h* Lys Pro Gly Ser Asn Arsr Gly Lys Asp 

661 ATT AAT ATT TAT CAT AAA CTC CAT AAA GCA TTC AAC TCC CCA TTT CTC AAC CCA ATA TTA 

221 H* A«n II* Tyr His Lys Val Asp Lys Al* Ph« Asn Trp Cly Ph« Leu Asn Cly II* teu 

721 ACC CCA CAA CTA CAA ACT CTC CCT CCA AAA TAC CCA CTT CAC CCC CCA AAT ATT CAT TTC 

:41 Arg Cly Clu L*u Clu Thr Leu Arg Cly Ly* Tyr Ary Val Clu Pro Cly Asn II* Asp Ph« 

7B1 ATA CCC ATA AAC TAT TAT TCA TCA TAT ATT CTA AAA TAT ACT TOO AAT CCT TTT AAA CTA 

261 II* Gly 11* Asn Tyr Tyr Ser Ser Tyr II* Val Lys Tyr Thr Trp Asn Pro Phe Lys L*u 

841 CAT ATT AAA CTC CAA CCA TTA CAT ACA CCT CTA TCC ACA ACT ATC CCT TAC TCC ATA TAT 

2B1 His II* Lys Val Clu Pro Leu Asp Thr Gly Leu Trp Thr Thr M«t Gly Tyr Cys He Tyr 

901 CCT ACA CCA ATA TAT CAA CTT CTA ATC AAA ACT CAT CAC AAA TAC CCC AAA CAA ATA ATC 

301 Pro Arg Gly H* Tyr Clu Val Val Met Lys Thr His Clu Lys Tyr Gly Lys Clu H« He 

961 ATT ACA GAG AAC CCT CTT CCA CTA CAA AAT CAT CAA TTA ACC ATT TTA TCC ATT ATC ACC 






121 TCC CCT AAC CCA TCT AAT CAT TCC CAA CTC TAT AAA CAA CAC ATA GAG CTT ATC CCT CAC 180 
41 Ser Cly Lys Ala Cys Asn His Trp Clu Leu Tyr Lys Clu Asp He Clu L«g Met Ala Glu 






181 CTC CCA TAT AAT CCT TAT ACC TTC TCC ATA CAC TCC ACT ACA ATA TTT CCC ACA AAA CAT 240 
SI Leu Gly Tyr Asn Ala Tyr Arg Phe S«r He Clu Trp Ser Arg He Phe Pro Arg Lys Asp 






241 CAT ATA CAT TAT GAG TCC CTT AAT AAC TAT AAC CAA ATA GTT AAT CTA CTT ACA AAA TAC 300 
81 His II* Asp Tyr Clu Ser l^u Asn Lys Tyr Lys Clu He Val Asn Leu Leu Arg Lys Tyr 100 






321 II* Thr Glu Asn Gly Val Ala Val Clu Asn Asp Clu L*u Arg He L*u Ser He He Arg 340 

1021 CAC TTA CAA TAC TTA TAT AAA CCC ATC AAT CAA CCA CCA AAC CTC AAA CCA TAT TTC TAC 

341 His L*u Cln Tyr Leu Tyr Lys Ala H«t Asn CU Cly Al* Lys Val Lys Cly Tyr Phe Tyr 

1081 TCC ACC TTC ATC CAT AAT TTT CAC TCC CAT AAA CCA TTT AAC CAA ACC TTC CCA CTA CTA 

361 Trp ser Phe Met Asp Asn Phe Clu Trp Asp Lys Cly Phe Asn Cln Arg Phe Cly Leu Val 

IHl CAA CTT CAT TAT AAC ACT TTT CAC ACA AAA CCT AGA AAA ACC CCA TAT CTA TAT ACT CAA 

381 Clu Vai Asp Tyr Lys Thr Phe Ctu Arg Lys Pro Arg Lys Ser Ala Tyr Vai Tyr Ser Cln 

rjOV ATA CCA CCT ACC AAC ACT ATA ACT c;aT CAA TAC CTA CAA AAA TAT CCA TTA AAC AAC CTC 

401 rie Ala Arg Thr Lys Thr lU Ser Asp CU Tyr Leu Clu Lys Tyf Cly Leu Lys Asn Leu 

1261 CAA TAA 1266 

421 Glu End 422 
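
The pairing of nucleotide rows with amino acid rows in these figures follows the standard genetic code. As a minimal sketch, the code below translates the coding portion of the Staphylothermus marinus 5' primer from Example 1 (the bases from ATG onward) and reproduces the first seven residues of the figure above, Met Ile Arg Phe Pro Asp Tyr. The function name is hypothetical; the codon table is the standard code written in the conventional TCAG order.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna):
    """Translate DNA in frame from its first base, stopping at a stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Coding portion of the 5' primer printed as SEQ ID NO:35 (from ATG onward).
coding_start = "ATGATAAGGTTTCCTGATTAT"
print(translate(coding_start))
```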
ATCv CTJ^ CCA CAA GGC TTT CTC TO? CCC CTC tCC CAf. TCC COC TTT CXC TTC OXC AtC CCC 

net L«u Pro CU cly yh« L-u Trp Cly v»l *ef CVn 4«r Cly Ph« Cln Ph« Oiu HtC Oiy 

CAC AAC CTC Aft; AflC AAC ATT OAT CCC AAV At> CAC TCC TCC AAC TCC (TTC ACC CAT CCC 

X«p Ly» t.«u Any Ai-g A*n lU Ajrp rro Ajn Tbr Ajp Trp Tip Lyi Trp v*l Mxg Atp Pro 



3Q1 CW GAC ASC TAC CCWl CTC CTC MC CAC AAA ATC GAT AAA GAC ACS CTC GAA CTC 

101 Ar9 A*p S.r Tyr Cly Uu V«l Ly» A*c V«l X«» Ly« A*P Thr L«u Clu Cly Uu 

3<1 CAC GAG ATA CCC AAT CAT CAC CAC ATA CCC TW: T*C CCC COC CTT ATA GAO SWT CTC ACC 

U: A#p Clu :u Xi* A^s Hi. cm CU r-* ax* TVt Tyr Ara Ar» V*a II* Clu Hi* Utt Ax^r 

431 CAC CTC CCC rrC JJU; etc ATC CTC AAC CTC AAC CAC rrC ACG CTC CCC CTC TOC CTT CAC 

141 Clu L«u Cly ?h» Ljr» V4I tU V«i Aan L«u A*n HiJ ?bv rhrhmx iro l^u Trp Uu Ki. 

4il CAT CCC ATA ATC WC ACC GAC AAC CCC CTC ACC AAC CCT ACC ATT OCX! T« CTC C^ CAC 

161 A*p Pro Ut rU Xia Arg Clu Ly« Thr Aa« Cly Xtb IU Cly Trp Cly Cln 

341 CAC JUX CW CrC CAC TTC CCC AAC TAC CCO OCO TAC ATC CCO AAC CCA CTC C« GW: CTC 

m ^ vll V*I Clu Phi Al* Tyr XU AlA Tyx II. aIa A.n Ai* L.u Cly W Uu 






CTC CTC ACC- C/JC CAC CTU CCC CAC CAC CCC ATA XAC AAC TAC 110 




12 L TTC AAC ATA AAC AOG CAA _ _ 
<I Phm A«n lit -y« ATfl Clu ««u Val S«r nly Aflp l.*u 7ro Clu Clu Cly tl« A»a Mo Tyr 

181 CAA CTT TAC CAC AAC CAT CAC CCC CTC UCC AUA CAC CTC CCT CTC AAC CTT TAC ACC ATT 
$1 Clu L«u Tyr Clu JLyi Ajp Ki* AT? I-eu A-i Arg A*p i,tu Cly 1-eu A«u V»l Tyc Arg lie 

2<: CCA ATA CAC TCC ACC ACC ATC TTT CCC TCC CCA ACC TOG TTT GTG CAC CTT CAC (?rr 

ai Cly rl« Clu Trp 5.x Arg lit Ph. no Trp rro Thr Trp ?&• v*l clu v*l A«p v«l clu 100 




iOl err CAT ATG T«! ACC ACC TTC AAC CAC CCC Arc CTC crj etc CAC CrC CS^ ffO 
201 ASP Xec TTp S*r Thx Ph. A.a 5iu rrD v*l v.l vmI Clu L« Cly r>T Al* 

«Bi CCC -AC TCC GGC TTT ceo ceo occ CT^ ATC AAC CCC CAC Gca AAo CTC craw ATn: 

m pro ^ Pho pro pro Cly v*l «t a« Pro Clu Al. Al* Ly. i*« AI* lU Leu 

121 AAC A-n: ATA AAC CCC CAC CCA CX CCC nC AAC ATO ATA AAa AJ^J TTC GAC AOO CTA AAA 
24i lli i« S; AU ueu AU tyr .y« H« Uyn Ly. Ph. A^ xxg .val .y. 260 

711 CCCCAt AAOGATTCCCCKTCCCAdCCCOACCTCCCCATAAXCTAC JJO 

2U AU A5P :.y» A5? Ux xro s«r Clu Ala Clu Cly lU 11» Tyr Att A-n 11. Cly Vai 

.41 G-r TAT CCA -AC C«: TCC AAC SAC CCA AAS CAC CTC AAA CCT CCA CAX AAC CAC AAC TAC KO 

ail' aS I" i5 i« A*n A.P rro ty. A.p val Ly. Al. Al. Clu A«. A-p A« tyr 300 




tOl TTC CAC XGC 50= CTC TTC nC (»C SC* ATC CK- A« 60C WC CIC MC WC CXfl TTC CXC 

.41 C-T CAC ACC rrc CTC AX» Sn «S CAT Cte AM C« *AC a*C TOS AW «W OTT AAC lAC 

!!i cly ct^ "J l-y* V.i AT, Ki. ..u aiv A.n A.6 T« 11. Qly V.l A« Tyr J«0 

•O'l TAC AO: ABA OK BTC AGO TAT TCC (»C ax AAC rrC C« MC m CCC TO m wso 
JU AX9 eiu V.1 AT, Tyx S.r CJu ?r» Lr» Fro Sex Il« Jro l..u II. s.r 350 







1141 ACC CCC CTTA ACC GAC ATC CCC TGC a,, .a. ... ^^^-^ 
191 Arc rro Val Sec 

W03 CAC CCC A^ AAA TAC COC CTC CCC CT. T^C OTC ACC CAA AAC CCA ATX CCC CAT TCA ACT V.JO 




401 Clu Al 
241 CAC TCC CAC CTT TAC ACC CAA CAT ATA CAC CTA ATC CCA CAC CTC CCC TAC AAT CCC TAC 
b\ Hi. Trp Clu Uu ryr Ar, Clu A.p lU Clu L.u H.t AU Cln t.u Cly Tyr A.n AU Tyr 






.n, r«- TCC X-A CAC TCC ACC CCT CTC TTC CCC CAA CAC CCC AAA TTC AAT CAA CAA CCC 3S0 

III li. Z S.r Ar, L.U Pn. Pro Cl« Clu Cly Ly. Ph. A.n Cl. Clu AU 120 

3fil rrc AAC CCC TAC CCT CAA ATA ATT GAA ATC CTC CTT CAC AAC CCC ATT ACT CCA AAC CTT 

III 7^.1 ^1^9 Clu 11. XX. Glu rl. L.U Clu W» Cly 11. Thr Pro A3n V.l 

421 ACA CTQ CAC CAC TTC ACA TCA CO; CTTI TOG rX ATQ ceo AAC CCA CCC m TTC AAC CM 

UX HU Ph. Thr S.r Pro Uu Trp M.c Arg Ly. Cly Cly Ph. Leu Ly. Clu 







^ „ .ftx CCA CCC rrr CCC CTC CTC CAC CTC cac tac acc acc ijjo 
r. ^ Z ^.ZZZ Z Z r. ..u v.. v.. ... rrr ... «o 

42) Ph« Ly» Arfl Ars Pro Arg Ly< S.r Al. Tyr ii. 

1:21 ATA AAA CAC CAA CTC CTX: CCA AAC TAT CCC CTT CCC CAC CTA TCA 136S 
44 1 lU Ly. A,p Clu Leu L.u Al. Ly. Tyr Cly L.u Pro Clu Leu find 455 



Figure 5 



WO 98/24799



10/46 



PCT/US97/22623



THERMOCOCCUS CHITONOPHAGUS GLYCOSIDASE - 22G
COMPLETE SEQUENCE - 9/95

1 TTC err CCA CAC AAC TTT CTC TCC CCA CTT TCA CAC TCC CCA TTC CAC TTT CAA ATC f;cc 6 0 

1 Met Leu Pro Glu Asn Phe Leu Trp Gly Val Ser Gln Ser Gly Phe Gln Phe Glu Met Gly 20

61 CAC ACA CTC ACG ACG CAC ATT CAT CCA AAC ACA CAT ' TCC TGG TAC TCC GTA ACA CAT CAA 120 

21 Asp Arg Leu Arg Arg His Ile Asp Pro Asn Thr Asp Trp Trp Tyr Trp Val Arg Asp Glu 40

121 TAT AAT ATC AAA AAA CCA CTA CTA ACT CCC CAT CTT CCC CAA CAC CCT ATA AAT TCA TAT 180 

41 Tyr Asn Ile Lys Lys Gly Leu Val Ser Gly Asp Leu Pro Glu Asp Gly Ile Asn Ser Tyr 60

IBl CAA TTA TAT CAC ACA CAC CAA CAA ATT CCA AAC CAT TTA CCC CTC AAC ACA TAT ACG ATC 240 

61 Glu Leu Tyr Glu Arg Asp Gln Glu Ile Ala Lys Asp Leu Gly Leu Asn Thr Tyr Arg Ile 80

241 CCA ATT CAA TCC ACC AGA CTA TTT CCA TCC CCA ACG ACT TTT CTC CAC CTC CAC TAT CAA 300 

81 Gly Ile Glu Trp Ser Arg Val Phe Pro Trp Pro Thr Thr Phe Val Asp Val Glu Tyr Glu 100

301 ATT CAT GAG TCT TAC CCC TTC CTA AAC CAT CTC AAC ATT TCT AAA CAC CCA TTA CAA AAA 3 60 

101 Ile Asp Glu Ser Tyr Gly Leu Val Lys Asp Val Lys Ile Ser Lys Asp Ala Leu Glu Lys 120

361 CTT CAT CAA ATC CCT AAC CAA ACC CAA ATA ATA TAT TAT ACG AAC CTA ATA AAT TCC CTA 420 

121 Leu Asp Glu Ile Ala Asn Gln Arg Glu Ile Ile Tyr Tyr Arg Asn Leu Ile Asn Ser Leu 140

421 AGA AAO AGC OCT TTT AAC CTA ATA CTA AAC CTA AAT CAT TTT ACC CTC CCA ATA TCC CTT 480 

141 Arg Lys Arg Gly Phe Lys Val Ile Leu Asn Leu Asn His Phe Thr Leu Pro Ile Trp Leu 160

481 CAT GAT CCT ATC CAA TCT AGA CAA AAA CCC CTG ACC AAT AAC ACA AAC CGA TGG CTA ACC 540 

161 His Asp Pro Ile Glu Ser Arg Glu Lys Ala Leu Thr Asn Lys Arg Asn Gly Trp Val Ser 180

541 CAA ACG ACT err ATA CAC TTT CCX AAA TTT CCC CCG TAT TTA CCA TAT AAA TTC CGA CAC 600 

181 Glu Arg Ser Val Ile Glu Phe Ala Lys Phe Ala Ala Tyr Leu Ala Tyr Lys Phe Gly Asp 200

601 ATA CTA GAC ATC TCC ACC ACA TTT AAT (5AA CCT ATC CTC CTC CCC CAC TTG CCG TAT TTA 660 

201 Ile Val Asp Met Trp Ser Thr Phe Asn Glu Pro Met Val Val Ala Glu Leu Gly Tyr Leu 220

661 CCC CCA TAC TCA CGA TTC CCC CCC CCA CTC ATC AAT CCA CAA CCA CCA AAC TTA CTT ATC 720 

221 Ala Pro Tyr Ser Gly Phe Pro Pro Gly Val Met Asn Pro Glu Ala Ala Lys Leu Val Met 240

721 CTA CAT ATC ATA AAC CCC CAT CCT TTA CCA TAT ACC ATC ATA AAC AAA TTT CAC ACA AAA 780 

241 Leu His Met Ile Asn Ala His Ala Leu Ala Tyr Arg Met Ile Lys Lys Phe Asp Arg Lys 260

781 AAA CCT CAT CCA CAA TCA AAA CAA CCA OCT GAA ATA CCA AT? ATA TAC AAT AAC ATC CCC 940 

261 Lys Ala Asp Pro Glu Ser Lys Glu Pro Ala Glu Ile Gly Ile Ile Tyr Asn Asn Ile Gly 280

841 CTC ACA TAT CCC TTT AAT CCC AAA GAC TCA AAC CAT CTA CAA CCA TCC CAT AAT CCC AAT 900 

281 Val Thr Tyr Pro Phe Asn Pro Lys Asp Ser Lys Asp Leu Gln Ala Ser Asp Asn Ala Asn 300

901 TTC TTC CAC ACT CCC CTA TTC TTA ACG CCT ATC CAC ACG CGA AAA TTA AAT ATC GAA TTT 960 

301 Phe Phe His Ser Gly Leu Phe Leu Thr Ala Ile His Arg Gly Lys Leu Asn Ile Glu Phe 320

961 GAC CGA CAC ACA TTT CTT TAC CTT CCA TAT TTA AAC CCC AAT CAT TCC CTC CGA CTC AAT 1020 

321 Asp Gly Glu Thr Phe Val Tyr Leu Pro Tyr Leu Lys Gly Asn Asp Trp Leu Gly Val Asn 340

1021 TAT TAT ACA AGA CAA CTC CTT AAA TAC CAA CAT CCC ATC TTT CCA ACT ATC CCT CTC ATA 1080 

341 Tyr Tyr Thr Arg Glu Val Val Lys Tyr Gln Asp Pro Met Phe Pro Ser Ile Pro Leu Ile 360

1081 ACC TTC AAC CCC CTT CCA CAT TAT CGA TAC CCA TCT ACA CCA CCA ACC ACC TCA AAC CAC U40 

361 Ser Phe Lys Gly Val Pro Asp Tyr Gly Tyr Gly Cys Arg Pro Gly Thr Thr Ser Lys Asp 380

1141 CCT AAT CCT CTT ACT CAC ATT CCA TCC GAG CTA TAT CCC AAA CCC ATC TAC CAC TCT ATA 1200 

381 Gly Asn Pro Val Ser Asp Ile Gly Trp Glu Val Tyr Pro Lys Gly Met Tyr Asp Ser Ile 400

1201 CTA CCT CCC AAT CAA TAT CGA CTT CCT CTA TAC CTA ACA CAA AAC CCA ATA CCA CAT TCA 1260 

401 Val Ala Ala Asn Glu Tyr Gly Val Pro Val Tyr Val Thr Glu Asn Gly Ile Ala Asp Ser 420

12&1 AAA CAT CTA TTA AGC CCC TAT TAC ATC CCA TCT CAC ATT CAA CCC ATC CAA CAC CCT TAC 1320 

421 Lys Asp Val Leu Arg Pro Tyr Tyr Ile Ala Ser His Ile Glu Ala Met Glu Glu Ala Tyr 440
PYROCOCCUS FURIOSUS GLYCOSIDASE - 7G1
COMPLETE GENE SEQUENCE - 10/95

1 XTC TTC CCT e^AA A>.G TTC CTT TCG CCT CTC GCX CAA TCG OGT TTT CAC TTT &AA ATC. GCC £0 

1 Met Phe Pro Glu Lys Phe Leu Trp Gly Val Ala Gln Ser Gly Phe Gln Phe Glu Met Gly 20

61 GAT AAA CTC ACG ACG AAT ATT GAG ACT AAC ACT GA? TGG TCG CAC TCG CTA AGC CAT AAC 120 

21 Asp Lys Leu Arg Arg Asn Ile Asp Thr Asn Thr Asp Trp Trp His Trp Val Arg Asp Lys 40

12] ACA AAT ATA GAG AAA GCC CTC CTT AGT GGA GAT CTT CCC GAG GAG CGC ATT AAC AAT TAC 160 

41 Thr Asn Ile Glu Lys Gly Leu Val Ser Gly Asp Leu Pro Glu Glu Gly Ile Asn Asn Tyr 60

181 GAG CTT TAT CAC AAC GAC CAT GAG ATT CCA AGA AAC CTC COT CTT AAT CCT TAC AGA ATA 240 

61 Glu Leu Tyr Glu Lys Asp His Glu Ile Ala Arg Lys Leu Gly Leu Asn Ala Tyr Arg Ile 80

241 CGC ATA GAG TGG AGC AGA ATA TTC CCA TGG CCA ACG ACA TTT ATT GAT CTT GAT TAT AGC 300 

81 Gly Ile Glu Trp Ser Arg Ile Phe Pro Trp Pro Thr Thr Phe Ile Asp Val Asp Tyr Ser 100

301 TAT AAT GAA TCA TAT AAC CTT ATA GAA GAT GTA AAG AIC ACC AAG GAC ACT TTC GAC GAC 360 

101 Tyr Asn Glu Ser Tyr Asn Leu Ile Glu Asp Val Lys Ile Thr Lys Asp Thr Leu Glu Glu 120

361 TTA CAT GAC ATC CCC AAC AAC ACG GAG CTC CCC TAC TAT ACG TCA CTC ATA AAC AGC CTC 420 

121 Leu Asp Glu Ile Ala Asn Lys Arg Glu Val Ala Tyr Tyr Arg Ser Val Ile Asn Ser Leu 140

4 21 AGC ACC AAG GCG TTT AAG CTT ATA GTT AAT CTA AAT CAC TTC ACC CTT CCA TAT TGG TTC 480 

141 Arg Ser Lys Gly Phe Lys Val Ile Val Asn Leu Asn His Phe Thr Leu Pro Tyr Trp Leu 160

4 8: CAT GAT CCC ATT GAC GCT AGC GAG ACG CCC TTA ACT AAT AAC AGG AAC GCC TCG CTT AAC 540 
161 His Asp Pro Ile Glu Ala Arg Glu Arg Ala Leu Thr Asn Lys Arg Asn Gly Trp Val Asn 180

5 4: CCA AGA ACA CTT ATA GAG TTT CCA AAG TAT GCC GCT TAC ATA GCC TAT AAG TTT CGA GAT 500 
181 Pro Arg Thr Val Ile Glu Phe Ala Lys Tyr Ala Ala Tyr Ile Ala Tyr Lys Phe Gly Asp 200

6C1 ATA GTG GAT ATG TCG ACC ACG TTT AAT GAG CCT ATC CTC GTT GTT GAG CTT CCC TAC CTA 660 

201 Ile Val Asp Met Trp Ser Thr Phe Asn Glu Pro Met Val Val Val Glu Leu Gly Tyr Leu 220

661 CCC CCC tAC TCT CCC TTC CCT CCA CCG GTT CTA AAT CCA GAG GCC CCA AAG CTC CCC ATA 720 

221 Ala Pro Tyr Ser Gly Phe Pro Pro Gly Val Leu Asn Pro Glu Ala Ala Lys Leu Ala Ile 240

721 CTT CAC ATG ATA AAT CCA CAT CCT TTA GCT TAT AGG CAG ATA AAC AAG TTT' GAC ACT CAC 780 

241 Leu His Met Ile Asn Ala His Ala Leu Ala Tyr Arg Gln Ile Lys Lys Phe Asp Thr Glu 260

781 AAA CCT CAT AAC GAT TCT AAA GAC CCT CCA GAA CTT CCT ATA ATT TAC AAC AAC ATT CGA 840 

261 Lys Ala Asp Lys Asp Ser Lys Glu Pro Ala Glu Val Gly Ile Ile Tyr Asn Asn Ile Gly 280

841 GTT GCT tAT CCC AAG GAT CCG AAC GAT TCC AAG GAT CTT AAC CCA CCA GAA AAC GAC AAC 900 

281 Val Ala Tyr Pro Lys Asp Pro Asn Asp Ser Lys Asp Val Lys Ala Ala Glu Asn Asp Asn 300

901 TTC TTC CAC TCA GCG C7G TTC TTC GAC CCC ATA CAC AAA GGA AAA CTT AAT ATA GAC TTT 960 

301 Phe Phe His Ser Gly Leu Phe Phe Glu Ala Ile His Lys Gly Lys Leu Asn Ile Glu Phe 320

961 CAC CCT GAA ACC TTT ATA GAT CCC CCC TAT CTA AAC CGC AAT GAC TCC ATA GGG GTT AAT 1020 

321 Asp Gly Glu Thr Phe Ile Asp Ala Pro Tyr Leu Lys Gly Asn Asp Trp Ile Gly Val Asn 340

1021 TAC TAC ACA ACC GAA CTA GTT ACG TAT CAC GAA CCA ATG TTT CCT TCA ATC CCC CTC ATC 1080 

341 Tyr Tyr Thr Arg Glu Val Val Thr Tyr Gln Glu Pro Met Phe Pro Ser Ile Pro Leu Ile 360

lOei ACC TTT AAG CGA GTT CAA CGA TAT CCC TAT GCC TCC AGA CCT GGA ACT CTC TCA AAC GAT lUO 

361 Thr Phe Lys Gly Val Gln Gly Tyr Gly Tyr Ala Cys Arg Pro Gly Thr Leu Ser Lys Asp 380

1141 GAC AGA CCC CTC AiGC GAC ATA CGA TCG GAA CTC TAT CCA GAG GCG ATC TAC GAT TCA ATA 120C 

381 Asp Arg Pro Val Ser Asp Ile Gly Trp Glu Leu Tyr Pro Glu Gly Met Tyr Asp Ser Ile 400

1201 CTT GAA GCT CAC AAC TAC CGC CTT CCA CTT TAC GTG ACC GAG AAC CGA ATA GCG CAT TCA l2fe0 

401 Val Glu Ala His Lys Tyr Gly Val Pro Val Tyr Val Thr Glu Asn Gly Ile Ala Asp Ser 420



Figure 8a
1251 AAC GAC XTC CTA ACA
421 Lya Aap il» itu Arg 

1321 GAG GAT CCC TAT CAA 

441 Clu Ajp cly T/r Clu 

1381 CCT CTC CCC TTT ACA 

4 61 AIa Lcu Gly Phe Axg 

MU ATT CCC AGO GAG AAG 

481 He Pro Arg Giu Lya 

1501 AAA AAG ATT GAA GAG 

501 Lya Lyj rie Giu GJu 



CCT TAC 

Pro Tyr 

CTT AAG 
V4i Lya 

ATC CCC 
Met Axg 

AGC CTC 

GAA TTC 
Slu Leu 



TAC ATA CCC AGC CAC ATA AAG ATC ATA r^r 

Tyr a. 5.r Hi. ne t," JI? JJ^" 

CCC. TAC TTC CAC TOC CCA TTA ACT GAC AAC TTC CAC TCC 
Cly Ty, Ph. HI. Trp *U ..a Thr A.p ^= 

TTX CCC CTC TAC CAA CTC AAC CTA ATT ACA AAC Cir ir. 
Ph. Cly L.U Tyr Giu v.l A.n ^eu nl ^= ^% 

TCC ATA TTC AGA (iM ATA CTA CCC AAT AA- err rT, .^^ 

s.r n. Ph. AT, Clu II. v.i Jll J^; en 



ij:o 

440 

1380 
460 

1440 
iBO 

1500 
500 



CTG AGG GGA TGA 
Leu Arg GLy End 



1533 
5U 



Figure 8b (Continued)
9 18 27 36 45 54

5 ' ATG ACA ATA CGT TTA GCG ACO CTC GCO CTC TCC CCA CCG CTG AGC CCA CTC ACC 
Met Arg Ile Arg Leu Ala Thr Leu Ala Leu Cys Ala Ala Leu Ser Pro Val Thr

63 72 81 90 99 108

TTT CCA GAT XXT GTA ACC CTA CAA XTC CAC GCC Cy^C GCC GGT AAA AAA CTC ATC 
Phe Ala Asp Asn Val Thr Val Gln Ile Asp Ala Asp Gly Gly Lys Lys Leu Ile

117 126 135 144 153 162

AGC CGA GCC CTT TAC GGC ATG AAT AAC TCC AAC CCA CAA ACC CTT ACC GAT ACT 
Ser Arg Ala Leu Tyr Gly Met Asn Asn Ser Asn Ala Glu Ser Leu Thr Asp Thr

171 180 189 198 207 216

GAG TOG CAC CC5T TTT CCC GAT GCA COT GTG CGC ATG CTG CGG GAA AAT GGC GCC 
Asp Trp Gln Arg Phe Arg Asp Ala Gly Val Arg Met Leu Arg Glu Asn Gly Gly

225 234 243 252 261 270

AAC AAC AGC ACC AAA TAT AAC TCC CAA CTG CAC CTC ACC ACT CAT CCG GAT TGG 
Asn Asn Ser Thr Lys Tyr Asn Trp Gln Leu His Leu Ser Ser His Pro Asp Trp

279 288 297 306 315 324

TAC AAC AAT CTC TAC GCC CGC AAC AXC AAC TGG GAC AAC CGG GTA GCC CTG ATT 
Tyr Asn Asn Val Tyr Ala Gly Asn Asn Asn Trp Asp Asn Arg Val Ala Leu Ile

333 342 351 360 369 378 

CAG GAA AAC CTG CCC GGC CCC GAC ACC ATG TGG CCA TTC CAC CTC ATC CGT AAC 
Gln Glu Asn Leu Pro Gly Ala Asp Thr Met Trp Ala Phe Gln Leu Ile Gly Lys

387 396 405 414 423 432

GTC GCQ GCG ACT TCT GCC TAC AAC TTT AAC GAT TGG GAA TTC AAC CAG TCG CAA 

Val Ala Ala Thr Ser Ala Tyr Asn Phe Asn Asp Trp Glu Phe Asn Gln Ser Gln

441 450 459 468 477 486

TCG TGG ACC CGC GTC GCT CAG AAT CTC OCT CGC CGC GOT GAA CCC AAT CTG GAC 

Trp Trp Thr Gly Val Ala Gln Asn Leu Ala Gly Gly Gly Glu Pro Asn Leu Asp

495 504 513 522 531 540 

GGC GGC GGC GAA GCG CTG GTT GAA GGA GAC CCC AAT CTC TAC CTC ATG GAT TCG 
Gly Gly Gly Glu Ala Leu Val Glu Gly Asp Pro Asn Leu Tyr Leu Met Asp Trp

549 558 567 576 585 594 

TCG CCA GCC GAC ACT CTG GGT ATP CTC GAC CAC TGG TTT GGC (TTA AAC 0C3G CTC 
Ser Pro Ala Asp Thr Val Gly Ile Leu Asp His Trp Phe Gly Val Asn Gly Leu

603 612 621 630 639 648

OGC CTG CGG CGT GGC AAA QCC AAA TAC TGO AGT ATG GAT AAC GAG CCC GGC ATC 
Gly Val Arg Arg Gly Lys Ala Lys Tyr Trp Ser Met Asp Asn Glu Pro Gly Ile

657 666 675 684 693 702 

TGG CTT CGC ACC CAC GAC GAT GTA CTO AAA CAA CAA ACC CCC GTA GAA GAT TTC 
Trp Val Gly Thr His Asp Asp Val Val Lys Glu Gln Thr Pro Val Glu Asp Phe



Figure 9a. 
CTG CAC ACC TAT TTC 2a ACC QCC 2^ i»» 756

U»u Hi, Thr Tyr S ^ '^T '"^ 

Ai« Aro Ala Lyo Pho Pro Ciy Uq 

774 7B3 



AAA ATC . ACC GOT CCC CTC CCC CCT .-..o "'^^ 310 

n. ^ ,„ - s: - s 0^ s ST s - i 

828 

to: TCC GTA CCC CAC GAA rjUl fy=r -T-nm gfiii 

s„ V. s; s: s s s - ii-^ ™ s ^ 

"3 883 

K» GTC TCT GWl GAC CAA CCM CCA ACT fiTT 318 

vaa SC. oxu 0. a. S S J2 0%%^ S 2^ - - - 



,,5 S54 



CTQ CAC TAC TAC CCC CGC CCT TAC »JL^ ^. 972 

- K.. ^ ,„ s 2: s v-s s SI 

. ACB TOC TTC CSAC CGC GAC m> fTTT. !I^f ^ ^, ^""^ 1026 

- ... p». S5 SI 51 s S 2; S S K ;n v=I^ 

1035 1044 xoc-, 

«A GOT GGC TOO GAT cac AOC ATC AAC AAfi r^. ltl 

^ S ^oj; m ,„« ^ e« 

lOas 1098 1107 ,,,, 

OAT roc CTC C»C CAA TAT ATO COG CCA GAC CXT^ rn,. 

Tx. ^ s es s s 

1143 1151 lifli 

ATO TGC OTtJ CGC JUT GTO AAT CCG ATT McJll^ 

... c„ v.. ^ ^ 1' - - - - JJC J^, ^ ^ 

"OS 1215 
ATG CTC GGC ACC TIC OCG GAT AAC COC CTC Oxx llt ^ 

1251 1360 1269 
AAC ACC GGA ATO TOO GAA ACA CTC CAC cnr- -rJlli 

1305 1314 1323 

CCGCTCGCCTCCAGCTTXACTCTTGAAGAn T,^ "50 

Ara V.1 Ala Ser S« Ser ^ SI g^ SI S Sa J!!^ T 

rne vai ser Ala Tyr Ser Scr lie 

13fi8 1377 

AAC GAA CCA GAA GAG CCC ATO ACG rrx ^ 1<04 

- „i in s Ji: j:: i;: SI s J2 ^ 

Figure 9b (Continued)
Bankia gouldi endoglucanase (37GP1) (continued)

ACC OLc'i« CCC ACt'JJ^ OCt AKSi CAT TO^JJ CTO G^r*£ CCC TAc'^^ 
Thr Ki, Thr Al. ,^ V.I Aa. He A.p A.p P.. p„ u.u ^ S Sr 

1475 1485 1494 ii;ni 

1584 1593 1402 ifin 

Leu Pro Pro L« S«r V.l ftr Ala II. Leu L« Ly. ai« ' 



Figure 9a (Continued) 
Thermotoga maritima α-galactosidase



5* CTC ATC TCT rrrr^ .i? ^7 



lU cv« V.I clu n. P.. c:. ™ ~; " - - 



" 72 81 



Ly. «u Ly, PH. v,: cm Ph, Ala CaT cTu* Z;.- nl -ni; ^ - 
^ i ^ *^ ^ cca ecu ^ or «c ^ err c« ^ 

Lys He ser Gly Ar, vai Ly, ciy ci"; i^^; Z^j j;;; 

Ala ^„ Clu Ly, vai vii 'a^ ^; ^ i^,' 

- § !f! !!! § !!5 ^ ^ ccc ^ 

v«i Ala Ph. ser Phe Lys p.^o' cii Ha i;;^ Aa^ j^;^ 

^hr Ala Ser Val V»l S^l 'f^ vll 'oil 'f^ 1^ Gl^ ^ 't^ ^ 
gtcc«Sa«acgaaaagtc^cgct 

Val Ala Clu Clu Cly Lys vll Gly Pha S« ^ ie ^3 ^ 

387 395 ... 

!!!!T??^?!?!??5?!f??^^f:i3™GCAT«:cn:oASTTrcoAT^ 

Phe Phe Al. Val clu Asp cly cIu 'l^ vll ',11 'r^ 'l^ Ghi ^ va 
<41 450 459 4CO 

Glu Pte A3P A3P Phe Val Pro i^a clu ^ro Z^u CII vll Z^il clu ii^ JZ 

<9S 504 sii c-^-j e-i^ 

™ !!5 !!: ^ 3Cf ^ ^ «^ «^ ^ ^vAc ^ 

1^ Pro Ueu l^u Clu Lys III clu ZIu Cly Zl Oil Zl 

■'^^ 558 5C7 57S sar 

-?^?!?"!:"!^^f^f^«^'^f^^T^*O^TO:TACCATTACrrc^ 

Arg val Pro Ly. ui. i-,"; ..v.; 'j; — 



Figure 10a
603 612 621 630 639 648

GAT CTC ACC TC3G CAA CAC ACV CX AAC AAC CTC A^G CTC OCG AAC AAT TTC CCC 



Asp L«u Thr Tro Glu Clu TKr L«u Lyn Asn Leu Lys i^u Ala Lys Ann Ph« pro 

657 666 675 684 693 702 

TTC GAG GTC TTC CAG ATA C»C GAC CCC TAC CAA AAC CAC AlA GGT OAC TCG CTC 

Ph« Glu Val Vhm Gin !!« Asp Asp Ala Tyr Glu Lys Asp Ue Cly A«p Trp Lou 

711 720 729 738 747 7S6 

CTCACAAGAGGAGACnrCCATCCCTGGAAGACAlOGCAAAA OTT ATA OCG GAA 

vll Th^ Arg GXy Asp Phe Pro S«r va Glu Glu Met AU lys Val lie Ala Glu 

765 774 783 792 801 810 

AAC(CTTICA3CCCGGGCAT^TC»ACCCCCCCGr:CAGTGTTTCTG^ 

A^ Gly siii lie vil Gly 111 lUr Ala Pro Ph. Scr Val Sex Glu Thr S«r 

819 82S B37 846 855 B64 

GATG7ATrcAACG?ACATCC0GACTC0™CTCAMGAAAAC 

A^ vZ Phe A^ gIu P^o A^ Val Val Lys Clu Aa Gly Glu Pro Lys 

a73 883 891 900 509 518 

ATOGCTTXCACAAACTOAACAAAAWmT^COCCTOMOT 

927 Q36 945 954 583 972 

CACCrrCTOAACTGGOTTICGATCICrrCTOOTOT 

01uv2'l^'^^'i^ Phe 1^ Phe S*r Leu Arg Lys h«t Gly ryr 

ofli 990 999 1008 1017 1026 

AGGTACTOAACATCGACTITCrCTrCGCCOTOrcOT 

l^'r^Phlll^ 111 Phe ^e Gly Ala Val Pro Gly Glu Ar? Lya 

101 S 1044 10S3 1062 1071 lOBO 

AAG AAC ATi ACA CCA ATT CAGGCGrrCAaxAAA^ATTGACACGATCA^ 

lyl III ne Gin III Ph» I^ 'i^vl <Jl^ ^ ^ ^y** 

inaq 1090 1107 1116 1134 

GCG GGA GAAGATTCrrrcATCOCCGATXI^CTCTCCCCTrOT 

Ma val c'lu '^^ IVr ?ul 'ill 'l^ GW '<^l Gly S« Pro Pro Ala 

Mil llb2 1161 1170 1179 1188 

crc ooA ctk: gac^ atx^ ago ata gga cct gac Acnr gcc ccg 

Va) cay vni Gly M«u A^g Ho Gly Pro A^ -n.x Ala Pro Phe Txx> Gly 

Figure 10b (Continued)
cii'^T ATA cTcU: AAC cll'lcT OX CCt'cCA ACA TOc'cC^ CTG A^A^AAC CCC

gIu illl Ue clu ilp Asn Cly Ai'I P^o Ila Ai. Aro Trp I-u Arg A« Al. 

1260 1JS9 U7B "87 X2S5 

ATA ACT MG OCTTCATCCACGACAOGTOTOOTAACGACCaaClOT 

lie Tte »^ Ty^ Phi wit iilc '*9 "Ts^ 

1105 1314 1323 1341 1350 

ATA CTC^ aujGWAAAACCCATCICACACAGAMGWAMGWCTC 

nl 'iZ Olu olu lyl^'f^'lZ^'di^ Lys Glu Lys Glu TVr S«: 

1377 ueS 13SS • 1*04 

XAC ACCI^ CGA Crc crc cy^AACAKATCmOAAafiWCWOTTaC^ 
to ^ Cly vli J;in K« iie ile du S« A^ A»P S*r Leu 

itii 1422 1<31 

GTC >GA^^T CAT 0=X 1^ ^ CTTT Vic AAA GAA ACO CTC ^ CIC CIC GOT «A 

^ vll 'iZ ciu i« 1^ Glu I*u I*u Gly Gly 

147S 1*85 1*9* 

Aa CCX cS CTrCAAA»CKCATOTCGCMCWCTO*aWGW*TC 

te^ tei Val Gta A^ He »it Clu Asp Leu Ars IVr Glu lie Val Scr 

1^51 1S30 1539 15<* 

TCT GGC^ CTC TCA J^CCTICAAOATCOTCTCGWOTAACi^AaGAO 

Ser Gly Thr Uu S« Gly Asn V<a Lys lie VM. val :.r; -i^;. . Olu 

1584 1593 IM2 1611 IS^" 

T»C 'OU: SI GAA AAA^ GGA AW TCC CTC A» MA 

Hii Ziiil Glu lyl Clu Gly Lyi S« sIr Uu Lys Ly. ixg Val Val Lys Aij 

tS19 1638 164"' 1*5S 1665 

GAA GAC Sa AGA AAC ttc tac ric TAC WA CAC 0^^ a« gaa tca 

Clu A^ oly aIh Phe Phe Clu Clu Gly Clu Aro Glu — 



Figure 10c (Continued) 
9 18 27 36 45 54

5' ATG GGG ATT GGT GGC GAC GAC TCC TGG AGC CCG TCA CTA TCG GCG GAA TTC CTT 

Met Gly Ile Gly Gly Asp Asp Ser Trp Ser Pro Ser Val Ser Ala Glu Phe Leu

63 72 81 90 99 108

TTA TTG ATC GTT GAG CTC TCT TTC GTT CTC TTT GCA ACT GAC GAG TTC GTG AAA 

Leu Leu Ile Val Glu Leu Ser Phe Val Leu Phe Ala Ser Asp Glu Phe Val Lys

117 126 135 144 153 162

GTG GAA AAC GGA AAA TTC GCT CTG AAC GGA AAA GAA TTC AGA TTC ATT GGA AGC 

Val Glu Asn Gly Lys Phe Ala Leu Asn Gly Lys Glu Phe Arg Phe Ile Gly Ser

171 180 189 198 207 216

AAC AAC TAC TAC ATG CAC TAC AAG AGC AAC GGA ATG ATA GAC ACT GTT CTG GAG 

Asn Asn Tyr Tyr Met His Tyr Lys Ser Asn Gly Met Ile Asp Ser Val Leu Glu

225 234 243 252 261 270 

ACT CCC AGA GAC ATO GGT ATA AAG GTC CTC AGA ATC TGG GGT TTC CTC GAC GGG 

Ser Ala Arg Asp Met Gly Ile Lys Val Leu Arg Ile Trp Gly Phe Leu Asp Gly

279 288 297 306 315 324

GAG AGT TXC TGC AGA GAC AAG AAC ACC TAC ATG CAT CCT GAG CCC GGT GTT TTC 

Glu Ser Tyr Cys Arg Asp Lys Asn Thr Tyr Met His Pro Glu Pro Gly Val Phe

333 342 351 360 369 378 

GGG CTG CCA GAA GGA ATA TCG AAC GCC CAG AGC GGT TTC GAA AGA CTC GAC TAC 

Gly Val Pro Glu Gly Ile Ser Asn Ala Gln Ser Gly Phe Glu Arg Leu Asp Tyr

387 396 405 414 423 432 

ACA GTT GCG AAA GCG AAA GAA CTC GGT ATA AAA CTT GTC ATT GTT CTT GTG AAC 

Thr Val Ala Lys Ala Lys Glu Leu Gly Ile Lys Leu Val Ile Val Leu Val Asn

441 450 459 468 477 486 

AAC TGG GAC GAC TTC GGT GGA ATC AAC CAG TAC GTG AGG TGG TTT GGA GGA ACC 

Asn Trp Asp Asp Phe Gly Gly Met Asn Gln Tyr Val Arg Trp Phe Gly Gly Thr

495 504 513 522 531 540 

CAT CAC GAC GAT TTC TAC ACA GAT GAC AAG ATC AAA GAA GAG TAC AAA AAG TAC 

His His Asp Asp Phe Tyr Arg Asp Glu Lys Ile Lys Glu Glu Tyr Lys Lys Tyr

Figure 11a
Thermotoga maritima β-mannanase (6GP2) (continued)



567 576 5fl5 

-!! !!! ^ ^! ^! ^ ^ac agg ci! 

V.l ser Phe Leu V.l Asn hII vll Zn 7^ 't^ oi'y vll 

612 621 630 fi-*o 

-!! !!! saa ccg ccc tc? g.c acg 

Glu Pro Thr He Met AI. Trp Glu L"u ^I'l gI« P« i;;; 'i';; ^ ~ 
666 675 684 fio-i 

^ ^ '^'^ "^^^^ «o tcc tac ata JSl 

Ly- S.r Gly A-a Thr L.« VX Glu vll 'l^ 'all Met S« Ser ^ 

730 729 738 747 

™ ™ ^ ^ =~ '"^ <^ TO TIC AGC lie 

Ser Leu Asp Pro Asn His Leu V.l Ala vll lly Zl gIu lly III HI IZ Zl 
765 774 783 792 901 

!-! !^ ™ !!! ^ ^ ^ ^ 'JJ aac ggc ^ 

Tyr Glu Gly Pho Ly. Pro Tyr Gly cly I'll III HI '^'^ ^ ™ 
"8 837 84S ass oca 

«f !!!!!!!!!!!! *~ rrc GGc aS 

Ser Gly V&l Asp Trp Ly. Lys Leu Leu Ser He Glu ^ vll 'Zl HI Hy 

851 900 909 01a 

!!! !" CAC TGG GGT GTC act CCA GAG AAC TAT GCC CAG TK 

Phe His L.U Tyr Pro Ser Hi, Trp Gly Val Ser Pro Glu Am ^ 'Zl HI 

"■^ "6 945 954 gg, 

!!! ^! !!! !^ *™ aag atc gca aaa gag ATC GGA AAA CCC 

Gly Ala Lys Trp He Glu Asp His II, Lys He Ala Lya Glu III Hy lH HI 
'81 990 999 1008 1017 

!!!!!!!!! !!!^ °" aga acg 2c 

Val Val Leu Giu Glu Tyr Gly He Pro Lys Ser All Pro vll 'Zl Irg HI III 

"** "53 1062 1071 loflo 

-!! !!! !!! "^"^ "^"^ '^'^ crc ggt gga gat gga gcc atc 

Ue Tyr Arg Leu Trp Asn Asp Leu Val Tyr Asp Leu Gly Gly HI ciy Zl HI 

Figure 11b (Continued)






Thermotoga maritima β-mannanase (6GP2) (continued)

[Nucleotide and amino acid rows for positions 1098 through 1593 are illegible in the source scan.]

Figure 11c (Continued)
[Nucleotide and amino acid rows beginning at position 1629 are illegible in the source scan.]



Figure 11d (Continued)
1a β-mannosidase (63GB1)



[Nucleotide and amino acid rows for positions 9 through 504 are illegible in the source scan.]

Figure 12a
1a β-mannosidase (63GB1) (continued)

5" 558 5S7 57fi 535 ca. 

AGO ACA CTT CTT CAC TTT GCC AAG TAT CCT GCT TAC ATC GCC CAT GCG CTC ^ 

Arg Thr Val V-1 Glu Phe Al« Ly, T^;^ M." AU 't^ HI HI 'hH ^ 

"2 621 630 639 

™ !^ !!! acc ttc aac c;** CCT atg cta err cnxj gag 

A*p L«u Val Asp Thr Trp Ser Thr Ph. Asn Glu Pro Met vll vll vll gIu 

S66 675 684 653 70, 

!^ ™ !^ ^ !^ *™ CCC SAG «C 

Gly Tyr L«u Ala Pro Tyr Ser Gly Phe Pro Pro Gly Vai Mec tl'n Pro Glu A^a 
■ 711 720 729 738 747 75s 

™ "° aac ccc cac gcc rro gca tat aag atc 

Ala Lya Leu Ala II. Leu Asn K.e 11. Asa Ala Kis Ala Zeu lyl nil 

■'SS 774 783 792 801 810 

ATA AAG AGG TTC GAC ACC AAG AAG GCC GAT GAO GAT AGO AAG TCC CCT GCG CAC 

Il» Ly» Ars Ph. A«p Thr Lys Ly. Ala Asp Glu Asp S*r Lys Ser Pro 'tZ 

828 837 846 855 864 

GTT GGC ATA XIT TAC AAC AAC ATC GGT CTT GCC TAC CCT AAA GAC CCT AAC GAT 

Val Gly II. lie Tyr Asn Asn II. Gly Val Al. Tyr Pro Hp p^o IH 

882 891 900 909 918 

CCC AAG GAC GTT AAA GCA GCC GAA AAC GAC AAC TAC TTC CAC AOC GGA CTG TTC 
Pro Lys Asp Val Lys Ala Ala Glu Asn Asp Asn Tyr Phe His Ser Gly Leu Phe

936 94S 954 963 972 

TTT GAT GCC ATC CAC AAG GCT AAC CTC AAC ATA GAO TTC CyiC GGC GAA AAC TTP 
Phe Asp Ala Ile His Lys Gly Lys Leu Asn Ile Glu Phe Asp Gly Glu Asn Phe

"° "9 1008 1017 i026 

CTA AAA GTT AGA CAC CTA AAA GGC AAT GAC TGG ATA GGC CTC AAC TAC TAC ACC 

Val Lys val Arg His Leu Lys Gly Asn Asp Trp II. Gly L.u xla Ty^ Ty^ Th^ 

1035 1044 1053 1062 1071 108O 

CGC GAC CTT GTT AGA TAT TCG GAG CCC AAO TTC CCA AGT ATA CCC CTC ATA TCC 

Arg Glu Val Val Arg Tyr Ser Glu Pro Lys Phe Pro Ser He P^o III 'ill Ser 
Figure 12b (Continued)
1a β-mannosidase (63GB1) (continued)

[Nucleotide and amino acid rows from position 1098 up to the final row are illegible in the source scan.]

1521 1530 1539

GAG TTC CTG AAG GGT GAG GAG AAA TGA 3'
Glu Phe Leu Lys Gly Glu Glu Lys 



Figure 12c (Continued)
OC1/4V endoglucanase (33GP1)



[Nucleotide and amino acid rows for this sheet, beginning at position 9, are illegible in the source scan.]

Figure 13a
OC1/4V endoglucanase (33GP1) (continued)

549

GAG CCT CCT CAO AAC ™ ACA °CT ^ AAA IIJ ccA Crr m CCA AAA ^ 
GIu Pro Ala Gin Asn Leu Thr ^l".' ci'u' HI ^^p" Zn L'u ly^ CII 

612 621 ff3o 

CTC AAA GTT ATC AGG GAG AGC AAT CCA ACC CGG Arr A^ CAT CCT CCA 

Leu Ly, v.i He Arg Gi^ s« a'i; z 'J, HI vii" III ™ 

666 675 £84 co-i 

^ !!! ^ ™ !!!! ^ !!! *f ctc HI oac aaa c2^ 

A.n Trp Ala His Tyr Scr A^I vll I^' ™ "^^ — 

711 720 729 738 -jai 

t?. *I! ^ «J ^ TAC GAA CCT AAA rrc ACA SIt CAC CGT ^C . 

II. lie V.1 Sex Ph. His Tyr Ty^ gIu Pro Phi lul ml 'cil ll~ 

''^ "* 792 Sfll 

™ !!!!!! ^! ^ GAG GAx 

Glu Trp Val Pro II. Pro Pro Val Z^g Vai 7rp gIy ll'u 

"8 83'' 846 855 

^ *!! ™ *!! ™^ ^^"^ AGT GAC TCO GCA AAG ^ 

Glu !!• Asa Gin II. Arg S.r Hi. Ph« lyl 't^ vll III 'f^ III Gl"n 

^^^! ™ !!* t!! !!!!!! GCT TAT TCA IaA GCA GAC Jtc 

Asn Asn Val Pro He Ph. Le« Gly Glu Phi Gly Ila sir Zyi III Zl III 

'^^ '36 945 954 oei o„ 

^- !^ ™ !!! !!! ^! *^ °CG GAA GAA riT S 

Xsp Scr Arg Val Ly, Trp Thr Gl'u vll Zl nil Z.I gI« III HI 

'^^ 9»9 1008 1017 ,n-,r 

!!! ™ ™! !!! ™! !!! ^ «A GGA TTT GGC ATA TAC GAT AGA^ 

Ph. Ser Tyr Ala Tyr Trp Glu Ph. Cy, All ciy HI gi^ HI ^ Zl Z'g 'r^ 

!!! ^! !!! !!! ^* cct ct« err ggc ggc aaa SIS 

S.r Gin Asn Trp II. Glu Pro L.u Zl Thr 'Zl vll vii oly Wy LyI Glu 



TAA 3- 



Figure 13b (Continued)
«ee X.P ... --- ... ... ... ...

v.. ... --- --- --- --- ... ... ... ^. 

"S 

ATX - - c« ^ ^ ii; ^ „^ JS „ ccc 

«. .-u =u o^, v.. ... „; ~- ~ ~ ~ ~ ~. ... 

™ !!! f^f - - »c S 0= ^ iil 

p.. M. c:, ^ ,~ - ~ ... ^ ^ 

225 234 

-f: !3! !:! ^f? ™ ^ IJJ ^ ^ - 
- v.. ™. ~ -- ~ ... „. ... 

^'^^ 288 

!!!!!! — ccc i^S ^ ^ ^ 

He Pro Val S«r Arg Val Glu Lvs Xl« III I" ZZ " 

Giu Lys Al. A«p Pro Thr Asp jup v.l Thr A.n 

333 

^ v.. ^ - ~ --. -~ ... --- ^. „. 

396 405 

~ ™ ™ - — - iS; S ^ - 

v.a ... ... „. ^- -;; ~ ~ ~ --- ^. ... ... 

*50 

»^ ^ ^ ^ - -;; - -~ ~ ... --- ... 

495 5Q4 ... 

X., PH. ^ - -;; ~ --. ... 

Figure 14a
545 ccn

Mc CO* 0^ ccc T.C c.c 1^ 1^1 

Lys A„ Oly Glu Th. Glu Pro Tyr 'oll '.11 7.1 a's'^ hH Glu" Z;," H; 

621 

!!!!!!!!! !^ !!! r ^ «^ «=c G.T J" cc ^ ^ 

A3„ Oly V.1 Trp Glu Ma V»X V.I gIu' g"i"; Zl La': g"; '.ll III ^ ^ 
fififi 675 fisz 

^ ^ "J G^ c.. S ^3 

OVr Cln L.U GXu A,„ Tyr Oly Lys Z Z', 'rZ ^i;; 'vll Z HI 

711 720 729 ..^ 

™ ^! !!^! ^ f= =TO AAT CTT ^ .GG ACA 1^1 

Ala V.1 Tyr Al« A« A.„ Gin Cl^ S« 111 vll vL" i;!; ^1 j;!; 
774 783 792 

!!t !^ !^ !!!! ^ «^ A«: GAA GGA J GAA GAC L'g 

Pro Glu Gly Trp Glu A« A.p 'o^ HI "i" ~- ;;;; 

819 828 837 fiifi 

ATA ATC TAT ^ CAC ATA «G GAC A^ ACA GGA CTC GAA IJ^ TCC GGG 

lie II. Tyr Glu II. Hi. Ili Jll L'p III 'rZ lly IZ ^ 

873 882 891 Qon 

^ !!!!!! !^! "^'^ ^ ACG CGA cc ^ 

Ly. Asn Ly, Gly Um Tyr I^u Gi; all Z'n tZ Zl H; Zl 

927 S3fi g- 

!!!!!! !t! !!! *^ ^ GTT St 

Gly Val Thr Thr Gly L.u Ser hI^ Zl III III HI gI^ vli ;;;; Hi^ C^i hIs 
981 990 lOOfi 

™ !!!!!!:!!!!! !^! !!! !^ *~ rrc'Sc 

lie Leu Pro Ph. Phe A,p Phe ;;;; ;i; zi iii zi zi iii zi iz iz 

1035 1044 1053 10fi2 

AAG TAC TAC AAC ^ GOT TAC GAT CCT TAC TTC At.3 CXt'SS GAG CCc'Z 

Ly- Tyr Tyr A.„ Trp Cly ^ Zl Zl Zl Zl Zl Zl Zl oZ Z'y Z, 

Figure 14b (Continued)

^^^^ 1098 nm 

T« ^C. ^ CCC ^ ^ ^ ^ ,^ ^ ^ 

'v. ^ ~ ~ --- ... ... ... ... ... 



X152 1K1 

CTC AAA GCC CTT CAC AAA CAC CGT AT^ Ger USfl 
: ^ A'^* CTG ATT ATG GAC ATC CTG TTC 

- - - - - c." ~ 

11S7 1206 13ie 

^ ^ r« ------ =0= c .cc's^ ^ „ - 

- ^ ^ n. 0. ^ ~ ~- ~ ~ ... ! ^ 

1251 1250 19*0 

^ *!! '^^^ ^ ^* CcJ TAT no'SJ ,3,^^ ^ 12.6 

^ - ^ AGC GGA TGT C3T AAC 

Pb« Tyr Arg lie Asp Lys Thr Glv ^ . " — 

ys Thr Gly Ala Tyr Leu A.n Clu Ser Gly cy, Gly Asn 
1305 1314 

- A^c =0. «c ccc ™ ^ ^ ^.^0 
V.1 n. ~- ~; ~ -~ --- .-- ... ^ ... .„ 

1359 1353 

""^ ^ ^ s i-; z:: 

1422 id-iT 

- ----- c„ 

n. ..p .y. ^„ - ~ - -~ ~ ~ ... ... ... 

^^^7 1475 i^Qc 

^C.A.A„C^TAC=.C.CC.^0CT«.^JS«.^^^ 

... ne^ne ^ - --- ... „. ... 

OCA AAO^^ ^ -J ^ ^XS« ^ ^^^^ 

.... AS. vax^A. - ™ - ... --- „, 

CAC GCA ATA AGG GOt"" GTO TTC^IIc CCG A«:'^ » "2° 

!!: "c aag gga ttc g«: atc gga 

Aap Ala He Arg Gly Ser Val . » 

Val Ph. A.„ Pro ser Val Ly. cly Phe Val Met Gly 

Figure 14c (Continued)
1629 1638 1647 1656 1665 1674

OCh TAG CCA AAG CAA ACC AAG ATC AAA AGG OCT GTT GTT CGA AGC ATA AAC TAC 

Gly Tyr Gly Lys Glu Thr Lys Ile Lys Arg Gly Val Val Gly Ser Ile Asn Tyr

1683 1692 1701 1710 1719 1728 

GAC CGA AAA CTC ATC AAA ACT TTC GCC CTT GAT CCA GAA GAA ACT ATA AAC TAC 

Asp Gly Lys Leu Ile Lys Ser Phe Ala Leu Asp Pro Glu Glu Thr Ile Asn Tyr

1737 1746 1755 1764 1773 1782 

GCA GCO TGT CAC GAC AAC CAC ACA CTG TCG GAC AAG AAC TAC CTT GCC GCC AAA 

Ala Ala Cys His Asp Asn His Thr Leu Trp Asp Lys Asn Tyr Leu Ala Ala Lys

1791 1800 1809 1818 1827 1836

GOT GAT AAG AAA AAG GAA TCG ACC CAA GAA GAA CTG AAA AAC GCC CAG AAA CTG 

Ala Asp Lys Lys Lys Glu Trp Thr Glu Glu Glu Leu Lys Asn Ala Gln Lys Leu

1845 1854 1863 1872 1881 1890

C-rr GOT GCG ATA CTT CTC ACT TCT CAA CCT GTT CCT TTC CTC CAC C-GA GGG CAG 

Ala Gly Ala Ile Leu Leu Thr Ser Gln Gly Val Pro Phe Leu His Gly Gly Gln

1899 1908 1917 1926 1935 1944 

GAC TTC TCC AOG ACO ACG AAT TTC AAC GAC AAC TCC TAC AAC GCC CCT ATC TCG 

Asp Phe Cys Arg Thr Thr Asn Phe Asn Asp Asn Ser Tyr Asn Ala Pro Ile Ser

1953 1962 1971 1980 1989 1998 

ATA AAC GGC TTC GAT TAC GAA AGA AAA CTT CAG TTC ATA GAC GTG TTC AAT TAC 

Ile Asn Gly Phe Asp Tyr Glu Arg Lys Leu Gln Phe Ile Asp Val Phe Asn Tyr

2007 2016 2025 2034 2043 2052

CAC AAG GGT CTC ATA AAA CTC AGA AAA GAA CAC CCT GCT TTC AGG CTG AAA AAC 

His Lys Gly Leu Ile Lys Leu Arg Lys Glu His Pro Ala Phe Arg Leu Lys Asn

2061 2070 2079 2088 2097 2106 

GCT GAA GAG ATC AAA AAA CAC CTG GAA TTT CTC CCC GGC GCG AGA AGA ATA CTT 

Ala Glu Glu Ile Lys Lys His Leu Glu Phe Leu Pro Gly Gly Arg Arg Ile Val

2115 2124 2133 2142 2151 2160 

GCG TTC ATG CTT AAA GAC CAC GCA GGT GGT GAT CCC TGG AAA GAC ATC GTG GTG 

Ala Phe Met Leu Lys Asp His Ala Gly Gly Asp Pro Trp Lys Asp Ile Val Val

Figure 14d (Continued)
Tr"" """" """

X.. ^ ^ - - ~ ... ... 

^- ^ ^"i -"^^ - - - ^ ^^.'Si 

- V. V. ~ ~ ... ... ... ... £ 

^ - ™ - ccc ^ cr.^'lS.^^,. 

o'.--. ... u„ ^ ^ ~ - ~ ~ --- ... 



Figure 14e (Continued)
Figure 15. Thermotoga maritima MSB8 (clone [illegible]) Glycosidase



1 



- - - - - - - - - o.c 

Val Glu Leu Ser Phe Val Leu Phe Ala Ser Asp Glu Phe 

GTG AAA GTG GAA AAC GGA AAA TTC GCT CTG AAC GGA n.. 
val Lys val Glu Asn Gly Lys Phe Ala T T ^= 
ly Lys Phe Ala Leu Asn Gly Lys Glu Phe Arg Phe 

Glv r """" ™^ -^'^ ^= TAC ATG CAT 

Oly Phe Leu ASP Gly Olu Ser Tyr Cys Arg Asp Lys As. Thr Ty. 

CCT OAG CCC GGT GTT rrc OGG GTC CCA CA.. GGA ATA TCG AAC GCC CAG AGC 
Pro Glu P.O Gly Val Pne Gly Val P.o Olu Gly Xle Se. Asn Ala Gin sZ 

Cly Zl T r '"'^ '^^^ CXC =GT ATA 

Cly Phe Olu Arg Leu Asp Tyr Thr Val Ala Lys Ala Lys Glu Leu Gly xle 

2 «TC ATT GTT C^ GTG AAC AAC TOG GAC GAC TTC GGT GGA ATG AAC 

Lys Leu Val Ile Val Leu Val Asn Asn Trp Asp Asp Phe Gly Gly

CAG TAC GTG AGG TGG TTT GGA GGA ACC CAT CAC GAC GAT TTC TAC AG. on. 
Gln Tyr Val Arg Trp Phe Gly Gly Thr His His Asp Asp Phe Tyr Zl

Z 7s Ue r T T ^ CAT 

Lys Ile Lys Glu Glu Tyr Lys Lys Tyr Val Ser Phe Leu Val Asn His

Tyr Thr Gly Val Pro Tyr Arg Glu Glu Pro Thr Ile Met Ala

TGG GAG CTT GCA AAC GAA CCG CGC TGT GAG ACG GAC AAA TCG GGG AAr .nr 
Trp Glu Leu Ala Asn Glu Pro Arg Cys Glu Thr Asp Lys Ser Gly Asn Thr
CTC GTT GAG TGG GTG AAG GAG ATG AGC TCC Tin .^n

rr, v« z z z n. r r °" 

Tyr Ile Lys Ser Leu Asp Pro

AAC CAC CTC GTG GCT GTG GGG GAC GAA GGA TTC TTC AGC AAC r.r 
Asn His Leu Val Ala Val Gly Asp Glu Gly Phe Phe Ser Asn Tyr Glu Gly

TTC ^ CCT TAC GGT GGA GAA GCC GAG TGG GCC TAC AAC GGC TOO TCC GGX 
Phe .ys Pro Gly Gly Glu Ala Glu Trp Ala Tyr Asn Gly Trp Ser Gly

Orr GAC TGO AAO AAG CTC CTT TCG ATA GAG ACG GTG GAC TTC GGC ACG TTC 
-1 Asp Trp Lys Lys Leu Leu Ser Ile Glu Thr Val Asp Phe Gly Thr Phe

CAC CTC TAT CCG TCC CAC TGG GGT GTC AGT CCA GAG AAC TAT GCC CAG TGG 
His Leu Tyr Pro Ser His Trp Gly Val Ser Pro Glu Asn Tyr Ala Gln Trp

OGA GCG AAG TGG ATA GAA GAC CAC ATA AAG ATC GCA AAA GAG ATC OGA AAA 
Gly Ala Lys Trp Ile Glu Asp His Ile Lys Ile Ala Lys Glu Ile Gly Lys

CCC GTT GTT CTG GAA GAA TAT GGA ATT CCA AAG AGT GCG CCA GTT AA. .r. 
Pro Val Val Leu Glu Glu Tyr Gly Ile Pro Lys Ser Ala Pro Val a^ Z

ACG GCC ATC TAC AGA CTC TGG AAC GAT CTG GTC TAC GAT CTC GGT GGA GAT 
Thr Ala Ile Tyr Arg Leu Trp Asn Asp Leu Val Tyr Asp Leu Gly Gly Asp

Gly «e. Phe Trp Met Leu Ala Gly Ile Gly Glu Gly Ser Asp Arg Asp

GAG AGA GGG TAC TAT CCG GAC TAC GAC GGT TTC AGA ATA GTG AAC GAC GAC 
Glu Arg Gly Tyr Tyr Pro Asp Tyr Asp Gly Phe Arg Ile Val Asn Asp Asp

sZ ii: 2 r oT r - - - 

Ser Pro Glu Ala Glu Leu Ile Arg Glu Tyr Ala Lys Leu Phe Asn Thr Gly

GAA GAC ATA AGA GAA GAC ACC TGC TCT TTC ATC CTT rr^ .n. 

- n. ^ - - - - - coc 

GAG ATC AAA AAG ACC GTG GAA GTG AGG GCT GGT GTT TTC GAC TAC AGC AAC



Figure 15b (continued) 
Glu Ile Lys Lys Thr Val Glu Val Arg Ala Gly Val Phe Asp Tyr Ser Asn

ACG TTT GAA AAG TTG TCT GTC AAA GTC GAA GAT CTG GTT TTT GAA AAT GAG

Thr Phe Glu Lys Leu Ser Val Lys Val Glu Asp Leu Val Phe Glu Asn Glu 

ATA GAG CAT CTC GGA TAC GGA ATT TAG GGC TTT GAT CTC GAC ACA ACC CGG 
Ile Glu His Leu Gly Tyr Gly Ile Tyr Gly Phe Asp Leu Asp Thr Thr Arg

ATC CCG GAT GGA GAA CAT GAA ATG TTC CTT GAA GGC CAC TTT CAG GGA AAA 
Ile Pro Asp Gly Glu His Glu Met Phe Leu Glu Gly His Phe Gln Gly Lys

ACG GTG AAA GAC TCT ATC AAA GCG AAA GTG GTG AAC GAA GCA CGG TAC GTG 
Thr Val Lys Asp Ser Ile Lys Ala Lys Val Val Asn Glu Ala Arg Tyr Val

CTC GCA GAG GAA GTT GAT TTT TCC TCT CCA GAA GAG GTG AAA AAC TGG TGG 
Leu Ala Glu Glu Val Asp Phe Ser Ser Pro Glu Glu Val Lys Asn Trp Trp 

AAC AGC GGA ACC TGG CAG GCA GAG TTC CGG TCA CCT GAC ATT GAA TGG AAC 
Asn Ser Gly Thr Trp Gln Ala Glu Phe Gly Ser Pro Asp Ile Glu Trp Asn

GGT GAG GTG GGA AAT GGA GCA CTG CAG CTG AAC GTG AAA CTG CCC GGA AAG 
Gly Glu Val Gly Asn Gly Ala Leu Gln Leu Asn Val Lys Leu Pro Gly Lys

AGC GAC TGG GAA GAA GTG AGA GTA GCA AGG AAG TTC GAA AGA CTC TCA GAA 
Ser Asp Trp Glu Glu Val Arg Val Ala Arg Lys Phe Glu Arg Leu Ser Glu

TGT GAG ATC CTC GAG TAC GAC ATC TAC ATT CCA AAC GTC GAG GGA CTC AAG 
Cys Glu Ile Leu Glu Tyr Asp Ile Tyr Ile Pro Asn Val Glu Gly Leu Lys

GGA AGG TTG AGG CCG TAC GCG GTT CTG AAC CCC GGC TGG GTG AAG ATA GGC
Gly Arg Leu Arg Pro Tyr Ala Val Leu Asn Pro Gly Trp Val Lys Ile Gly

CTC GAC ATG AAC AAC GCG AAC GTG GAA AGT GCG GAG ATC ATC ACT TTC GGC 
Leu Asp Met Asn Asn Ala Asn Val Glu Ser Ala Glu Ile Ile Thr Phe Gly

GGA AAA GAG TAC AGA AGA TTC CAT GTA AGA ATT GAG TTC GAC AGA ACA GCG 
Gly Lys Glu Tyr Arg Arg Phe His Val Arg Ile Glu Phe Asp Arg Thr Ala



Figure 15c (continued)
Z r T XAC CAT

Gly Val Glu Leu His Ile Gly Val Val Gly Asp His Leu Arg Tyr Asp

GGA CCG ATT TTC ATC GAT AAT GTG AGA CTT TAT AAA AGA ACA GGA GGT ATG
Gly Pro Ile Phe Ile Asp Asn Val Arg Leu Tyr Lys Arg Thr Gly Gly Met

TGA 1991 
END 



Figure 15a (continued)
Figure No. 16. Thermotoga maritima MSB8 (6gb4)
1 ATG AAA AGA ATC GAC CTG AAT GGT TTC TGG AGC GTT ACG GAT A.r n.. 

1 Met Lys Arg Ile Asp Leu Asn Gly Phe Trp Ser Val Arg Asp Asn Glu Gly Arg Phe Ser



60 
20 



^^^^z^z-zzzzzz-zzz-- z 

z r r - - c.o x.c ^ «o 

tys Val Tyr Ile Lys Ser Pre Ile Arg Val Pro Lys Thr Leu Glu Gln Asn Tyr Gly 140

Z r "'^ ^ C^G TAT rCG TAC 480 

141 Val Leu Gly Gly Pro Glu Asp Pro Ile Arg Gly Tyr Ile Arg .ys Ala Gln Tyr Ser Tyr 160

Z Z T r °" ^" ^ °- S40 

Gly Trp Asp Trp Gly Ala Arg Ile Val T.r Ser Gly Ile Trp Lys Pro Val Tyr Leu Glu 180

m "l Tyr A^ T r T °" "° ^" °- "° 

Tyr Arg Ala Arg Leu Gln Asp Ser Thr Ala Tyr Leu Leu Glu Leu Glu Gly Lys Asp 200

Zl r r '™ ""^ === °^ GTG GAA GTT TAT 660 

"X Ala Leu Val Arg Val Asn Gly Phe Val His Gly Glu Gly Asn Leu Ile Val Glu Val Tyr 220

vll 21 gT T --'^ CTC TTC 7.0 

Asn Gly Glu Lys Ile Gly Glu Phe Pro Val Leu Glu Lys Asn Gly Glu Lys Leu Phe 240

721 GAT GGA GTG TTC CAC CTG AAA GAT GTG AAA CTA TGG TAT CCG TGG AAr r.n r- AAA CCG 780

241 Asp Gly Val Phe His Leu Lys Asp Val Lys Leu Trp Tyr Pro Trp Asn Val Gly Lys Pro 260
781 TAC CTG TAC GAT rrn nmny

^ GTT TTC GTG TTG AAA GAP '^a ^ 

841 AAG AAA ATC GGT TTG AGA AGA GTC AGA ATC G"T CAr r. 

- - - ^« v.. z 2 r z z z z -z z 

901 TTC ATA TTC GAA ATC AAC GGT GAG AAA GTC TTC GC^ ..r r- 

- - ... 7: - - - - - - - „. 

951 GAA AAC ATC CTC ACG TGG TTG AAG GAr r^^ r^.^ 

^ y ^iy lie Tyr Glu Arg Glu He Phe 3so 

z z z z: z t: z z z z ™ r r - - - — » 

y Met Val Trp Gln Asp Phe Met Tyr Ala Cys Leu 380

eu Trp eye Gly Asn Asn Glu Asn Asn 420
y Val Asp Gly Ile Asn Leu Gly Asn 440

^^^^^zzzzzzzzzzzzzzz 
^^^--zz-zzzzzzzzz z ■ 

1441 GTC TGG TAC GTG TGG ACT GGC TGG ATG AAC TAC GAA ..r ..r.
481 Val Trp Tyr Val Trp Ser Gly Trp Met Asn Tyr Glu Asn Tyr Glu Lys Asp Thr Gly Arg 500

z - 2 2 2 z z z z 2 tT t - - - - - 

His Pro Glu Thr Ile Glu Phe Phe Ser 520
1561 AAA CCC GAG GAA AGA GAG ATA TTC CAT CCC GTC ^rr r.

-r. ^ „. z z z z z z z z r ? 

Leu Lys His Asn Lys Gln Val Glu 540
Figure 16b (continued)
3 580

.800 
600 



1621 GGA CAG GAA AGA TTO ATC AGO TTC ATA TTC GGA ^ 

y Arg Pro Lys Ala Leu Tyr Tyr Tyr 620 

--^^--zzzzzzzzzzzzz z 
---^--zzzzzzzzzzzz z 



2041 TGT GAG TTT GGT TGA 2055
681 Cys Glu Phe Gly End 685



Figure 16b (continued)



so 

20 
Figure No. 17. Bankia gouldi (37gp4)
1 ATG AAA AAA AAT CTA CTA ATG TTT AAA AGG CTT Arr ^ ^""'^ CCT TTG TTT TTA ATG CTG

1 Met Lys Lys Asn Leu Leu Met Phe Lys Arg Leu Thr Tyr Leu Pro Leu Phe Leu Met Leu

61 CTC TCA CTA AGT TCA GTA GCT CAA TCT CCT GTA Pivii ^ 

- - - - 2 z z z: 2 - 

121 OCA MC CQC ATT CTT AAT GCG TCT CGA CAA ATT ACO AGC TTA GCT GGT AAC AGC CTC TXT 

- 0.y .s„ A.3 XXe A3n Ala Se. GX. Glu Xle T.. s« .eu AU GX, a" s« ^^.^ 

181 TGG AGT AAT GCT GGA GAC ACC TCC GAT TTT TAT AAT GCA GAA ACT GTT GAT ^

- Se. A. AXa GX, A.P T. se. A-p T.. A. AXa G^ IZ Z Z ^ 2 

Trp Ser Ser Ile Arg Ile Ala Met Gly Val Lys Glu Asn Trp Asp Gly 100
301 GGA AAT GGC TAT ATT GAT AGT CCG CAG GAS C^i GAA GCT AAA ATT AGA AAA GTT ATT GAT 360

101 Gly Asn Gly Tyr Ile Asp Ser Pro Gln Glu Gln Glu Ala Lys Ile Arg Lys Val Ile Asp 120

III m" x" aT T "'^ CAC ACT CAC GAA GCA GAG TTA 420

121 Ala Ala Ile Ala Asn Gly Ile Tyr Val Ile Ile Asp Trp His Thr His Glu Ala Glu Leu 140

III Z 2 Z T 7: T - - - - ATT GTA 480

Tyr Ala Glu Gln Val Ile Ala Gly Ile Arg Ser Lys Asp Pro Asp Asn Leu Ile Ile Val 160

" "vTS":A'::rrr""="°""'^°"^-^------ACT .0 

y Ser Asn Tyr Ser Gln Gln Val Asp Val Ala Ser Ala Asp Pro Ile Ser Asp Thr 180

III z z z: z r r ^ - ™ - - -0 

Ala Tyr Thr Leu His Phe Tyr Ala Ala Phe Asn Pro His Asp Asn Leu Arg Asn ..o
721 GTA GCA CAG ACA GCA TTA GAT AAT AAT GTT GCT TTG TTT GTT ACA GAA TGG GGT ACA ATT 780

241 Val Ala Gln Thr Ala Leu Asp Asn Asn Val Ala Leu Phe Val Thr Glu Trp Gly Thr Ile 260
/ Leu Ile Ser Asn Lys Leu Thr Ala 320
SSI SAA ATT OTA AAA AAC ATC ATC CAA AAC TGC; r.T ... 

- - - n. ... - 2 z 7. z z 'z 

Arg Ala Ala Met Glu Thr Ala Gln Ala 3so

1081 GGA GAT GAA ATT ATA ATT GCC CCT GGA i»r n-.r
361 Gly Asp Glu Ile Ile Ile Ala Pro Gly Asn Tyr Asn Phe Gln Asp Lys Ile Gln Gly Ala 380

Tyr Tyr Gly Ser Ala Asn Gly Asn Ser Thr Asn Pro Ile Ile 400
1201 TTA AGA GGC GAA AGC GCT ACA AAC CCT CCT GTT TTC TCA cr. ^

- - 0. „. J - - z z z z z 'z 

1261 TAC CTA TTA AGT ATT GAA GGT GAT TAT TOT

^--^'^''-'^.^.^zzzzzzzzzzzzz 

^'-^^'^rzZZZZZZZZZZZZZ-- 

y rae Gly Glu Gly Leu Tyr Val Gly 500

-^^^^^zzzzzzzzzzzzzz - 

Figure 17b (continued) 
Asp Phe uu Asp Arg Gly Thr Gly Phe Asa Thr sco
"81 TGC CCA GAT TGG AAT ATA GAA CCA TGT AAT CCT GTA r.r r.. 

- ^ ... n. ... ,^ :» z: 2 z z 2 T. z r z 
-^^^^--—zzzzzzzzzz z 

2221 ACA GAT GAA CTT AAT GGT CTT ACA GAA GGA ACT TAT ACC T,-. . 

- ^ ^ »' - - - - - - ^ ... 

iiA AGT AAT TTA AAA ACA 2460 

Figure 17a (continued)
... .v. „^

=«> m ,rr «i tM „T ra c„ iac «t oo<= m mr ciu> ^ 
- - - n. ,^ ^„ J - - - - 

^ Tvr n. u. L„ ,„ „. 
«" SCA Mr CCA <iM »rA TCT ATt AOC AAI AOC tlA ATt CCT ..r ^ 

... ... c. :.. ... :z z z z z z z z z 

Ser i.y3 Thr Asn Asn Phe Thr Ile Tyr soo
2761 ATT ACT GAT GAT TCT AGT ATT AAT TTT AAG CTT -^c C- ^ nn. 

- - - A3. se. A. - z - - - 2 - 

Val Ser Ala Glu Asp Glu Lys Leu a1. Leu Val Leu Val Pro sgj



Figure 17a (continued)
Figure No. 18. Pyrococcus furiosus VC1 (7EG1)
leader sequence: amino acids 1-24 

" m" sT r r "° '^"^ -A OTA Jo 

Met Ser Lys Lys Phe Val Ile Val Ser Ile Leu Thr Ile Leu ..u Val 2



AAT 



CCA ATA TAT TTT OTA OAA AAO TAT CAT ACC TCT GAO OAC AAO TCA ACT TCA 
Ala Ile Tyr Phe Val Glu Lys Tyr His Thr Ser Glu Asp Ser Thr Ser Asn

135 144 ,5, 

ACC TCA TCT ACA CCA CCC CAA ACA ACA CTT TCC ACT ACC AAO CTT CTC AAO A^

Thr Ser Ser Thr Pro Pro Gln Thr Thr Leu Ser Thr Thr Lys Val x.e. Lys

"° "8 

2 z III r a"' - - 

Arg Tyr Pro Asp Asp Gly Glu Trp Pro Gly Ala Pro Ile Asp Lys Asp Gly Asp



"5 234 243 252 261 



OGO AAC CCA OAA TTC TAC ATT OAA ATA AAC CTA TOO AAC ATT AAT OCT A^ 

Gly Asn Pro Glu Phe Tyr Ile Glu Ile Asn Leu Trp Asn Ile Leu Asn Ala T



2B8 297 306 3^5 



Glv pT r ™ '^'^ ^ 

Gly Phe Ala Glu Met Thr Tyr Asn Leu Thr Ser Gly Val Leu His Tyr Val Gln

CAA CTT OAC AAC ATT OTC TTO AGO G^^ AOA AOT ^T TOG OTO CA^ OOA TAC Zl 
Gln Leu Asp Asn Ile Val Leu Arg Asp Arg Ser Asn Trp Val His Gly ^ p"



"S 405 414 423 



OAA ATA TTC TAT OOA AAC AAO CCA TOO AAT OCA AAC TAC OCA Z OAT GGC C^ 
Glu Ile Phe Tyr Gly Asn Lys Pro Trp Asn Ala Asn Tyr Ala Thr Asp Gly Pro

ATA CCA TTA CCC AOT Z GTT TCA Z CTA ACA G^C TTC TAT CtI ACA ATC TCC 
Ile Pro Leu Pro Ser Lys Val Ser Asn Leu Thr Asp Phe Tyr Leu Thr Ile Ser
504

TAT AAA CTT GAG CCC AAG AAC GGC CTG CCA ATT AAC r.r 

- G. ... - - - - - OAA T. .0 

TTA ACG AGA GAA GOT TGG AGA ACA ACA GGA « 

^eu r.r Ar. GXu AXa T.p ^-'^ ^TA 

9 -ftr Thr Gly Ile Asn Ser Asp Glu Gln Glu Val

S03 

ATG ATA TGG ATT TAC Tar r's,- 64 8 

- - :^ z ™ - - - z - - - 2 

675 684 
ATT GTA GTC CCA ATA ATA GTT AAC GGA ACA CCA GTA AAr . GAA GTA

Ile Val Val Pro Ile Ile Val Asn Gly Thr Pro Val Asn Ala Thr Phe Glu Val

TGG AAG GCA AAC ATT GGT TGG GAG TAT GTT GCA ^ AGA ATA 111
Trp Lys Ala Asn Ile Gly Trp Glu Tyr Val Ala P^

7" 792 

AAA GAG GGA ACA GTG ACA ATT CCA TAC GGA GCA TTT ATA AGT Z 
Lys Glu Gly Thr Val Thr Ile Pro Tyr Gly Ala Phe Ile Ser Val Ala Ala Asn

828 837 846 

ATT TCA AGC TTA CCA AAT TAC ACA GAA CTT TAC TTA GAG GA. Zl 
Ile Ser Ser Leu Pro Asn Tyr Thr Glu Tyr 2 Z Z Z Z

"2 891 
ACT GAG TTT GGA ACG CCA AGC ACT ACC TCC GCC CAr r-r. 
Thr Glu Phe Gly Thr Pro Ser Thr Thr Ser Ala His Leu Glu Trp Trp Ile Thr

^"^^ 936 945 

AAC ATA ACA CTA ACT CCT CTA GAT AGA CCT CTT ^ TCC TAA 3 ■ 
Asn Ile Thr Leu Thr Pro Leu Asp Arg Pro Leu Ile Ser T



Figure 18b (continued)